
Session 22 — Your AI Mini-Project & Showcase

Duration: 75 min · Format: live online · Ages: 12–15

Session goal: by the end, students have built their own small image or text classifier, measured it honestly on data it never saw, named its limits, and presented it to the class in a clear structure.

Before class — prep (5 min)

Open Google Colab → New notebook, ready to screen-share both starter templates below.
Have the two starter templates (vision and text) pasted into a doc or the chat so students can grab whichever they pick.
Reminder for yourself: this is a build session — keep teaching short and protect the build and present time. Students already have everything they need from Sessions 19–21.
Decide the running order for the showcase in advance so every student knows their slot.

Agenda

Time	Segment
0:00	Hook — you're the builder now (4 min)
0:04	Teach — pick your track + the honesty checklist (11 min)
0:15	Build — your mini-project in Colab (35 min)
0:50	Showcase — present your project (18 min)
1:08	Check for understanding + wrap-up (7 min)

0:00 · Hook (4 min)

Ask the class and take a couple of quick answers:

"For four sessions you learned how machines see and read. Today you build one. Vision or text — which is calling you?"
"What's one real thing you'd love a small classifier to sort — doodles, movie reviews, spam, emojis, plant photos?"

Land it fast: today they're the builder, not the audience. The goal isn't a perfect model — it's a working one they can explain honestly, limits and all.

0:04 · Teach — Pick your track + the honesty checklist (11 min)

Explain the two tracks briefly — both reuse code they already wrote:

Vision track — classify images (the digits from Session 20, or their own two-category Teachable Machine model exported and demoed).
Text track — classify text (a sentiment classifier like Session 21, on their own labelled sentences: reviews, messages, spam vs not).

Share the honesty checklist on your screen — every project must answer all five:

Question — what are you classifying, and into what categories?
Data — where did your examples come from, how many, and is it balanced?
Method — what turns your input into numbers, and what model did you train?
Results — your test-set accuracy, in a number.
Limits — one clear example it gets wrong, and why.

Key point to land: point 5 is not optional and not a weakness. Naming what your model gets wrong is what makes you a trustworthy builder — the whole message of this unit.

⚠ Watch for: students who want to hide or skip the failures. Reframe it — a project that honestly shows its limits beats one that pretends to be perfect. Every real AI has limits; the skill is knowing yours.

0:15 · Build — Your mini-project in Colab (35 min)

Have students open Google Colab → New notebook, pick a track, and start from the matching template. Circulate (or work the chat) and help them get to a working model fast, then push them to the honest evaluation.

Each project moves through the AI mini-project cycle: plan, build, test, improve, present.

[Figure: diagram of the AI mini-project cycle (plan, build, test, improve, present)]

Vision starter (Type/run this together in Colab):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

digits = load_digits()
X, y = digits.data, digits.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)

preds = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, preds))     # your Results number
print(confusion_matrix(y_test, preds))                # find a confusion to explain
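
One possible way to pick "a confusion to explain" from the matrix above: blank out the diagonal (the correct answers) and look for the largest remaining cell. This is a minimal sketch that repeats the starter's setup so it runs on its own; the names off_diagonal, true_digit and predicted_digit are illustrative and not part of the lesson's template.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# Same setup as the vision starter
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=1)
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)
preds = model.predict(X_test)

# Rows are the true digit, columns are the predicted digit
cm = confusion_matrix(y_test, preds, labels=model.classes_)

# Hide the correct answers so only the mistakes remain
off_diagonal = cm.copy()
np.fill_diagonal(off_diagonal, 0)

if off_diagonal.max() == 0:
    print("No mistakes on this test set")
else:
    row, col = np.unravel_index(off_diagonal.argmax(), off_diagonal.shape)
    true_digit, predicted_digit = model.classes_[row], model.classes_[col]
    print(f"Most common mix-up: a real {true_digit} read as {predicted_digit} "
          f"({off_diagonal[row, col]} times)")
```

The pair it prints is a good starting point for the Limits answer: students can look at one of those test images and say why the two digits are easy to confuse.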

Text starter (Type/run this together in Colab):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Replace with YOUR labelled examples — aim for 16+ and keep it balanced
texts  = ["I love this", "So much fun", "Absolutely great", "Best ever",
          "I hate this", "So boring", "Really terrible", "Worst ever"]
labels = ["positive", "positive", "positive", "positive",
          "negative", "negative", "negative", "negative"]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=1)

vectorizer = CountVectorizer()
X_train_v = vectorizer.fit_transform(X_train)
X_test_v  = vectorizer.transform(X_test)              # SAME vectorizer

model = LogisticRegression()
model.fit(X_train_v, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test_v)))
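
Once the text model is trained, students can try it on a sentence of their own. The sketch below assumes the starter's sentences and uses all of them for training, so it demonstrates prediction only and not an honest accuracy. The sentence in new_texts is a made-up example; the key step is calling transform (not fit_transform) on the vectorizer that was already fitted.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Example sentences from the text starter
texts = ["I love this", "So much fun", "Absolutely great", "Best ever",
         "I hate this", "So boring", "Really terrible", "Worst ever"]
labels = ["positive", "positive", "positive", "positive",
          "negative", "negative", "negative", "negative"]

vectorizer = CountVectorizer()
X_train_v = vectorizer.fit_transform(texts)    # learn the vocabulary once
model = LogisticRegression()
model.fit(X_train_v, labels)

new_texts = ["this was so much fun"]           # hypothetical new sentence
X_new_v = vectorizer.transform(new_texts)      # SAME vectorizer, transform only

print("Prediction:", model.predict(X_new_v)[0])
# Both matrices have one column per word learned from the training text
print("Columns:", X_train_v.shape[1], "and", X_new_v.shape[1])
```

Words the vectorizer never saw in training (here, "was") are silently ignored, which is itself a limit worth naming in the showcase.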

Push every student to do these three things (this is the real work):

Make it theirs — change the data (their own sentences, or a Teachable Machine model), the categories, or the test example.
Get the Results number — print test-set accuracy. No number, no project.
Find one honest failure — a misread image or a fooled sentence — and be ready to say why.

Circulate for the recurring traps from earlier sessions: measuring on training data (must be X_test), reshape(8, 8) for showing a flattened image, and fit_transform vs transform for text.
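
For the reshape(8, 8) trap, a short sketch of what "folding" a flattened image means. It prints a rough text preview instead of a plot so it runs anywhere; the threshold of 7 is an arbitrary choice for illustration (pixel values in this dataset run from 0 to 16).

```python
from sklearn.datasets import load_digits

digits = load_digits()
flat = digits.data[0]          # one image stored as a row of 64 numbers
grid = flat.reshape(8, 8)      # fold it back into 8 rows of 8 pixels

print(flat.shape, "->", grid.shape)

# Rough text preview: brighter pixels become '#', darker ones '.'
for row in grid:
    print("".join("#" if pixel > 7 else "." for pixel in row))

# In Colab, the picture version would be plt.imshow(grid, cmap="gray")
# after `import matplotlib.pyplot as plt`.
```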

0:50 · Showcase — Present your project (18 min)

Have each student give a 60–90 second talk following the five-point checklist, sharing their Colab screen for a quick live demo. Keep it moving; hold a short round of applause after each.

Coach them to hit all five out loud: Question → Data → Method → Results (the number) → Limits (the honest failure).

As each presents, ask one honesty question, for example:

"What's one input it would get wrong?"
"Was your data balanced, or did one category have more examples?"
"Would you trust this for something important? Why or why not?"

Land the pattern across talks: the best projects weren't the ones with the highest accuracy — they were the ones whose builder could clearly explain what it does, how well, and where it breaks.

1:08 · Check for understanding + wrap-up (7 min)

Ask these aloud or drop them in the chat. Answer key (for you):

Why must you report test-set accuracy, not training accuracy? → Training accuracy is measured on data the model already saw; only the held-out test set honestly shows how it does on new inputs.
Why is naming a limitation part of a good project, not a flaw? → Every real model has limits; stating them makes your work trustworthy and shows you understand it.
What's the shared recipe behind both tracks? → Turn the input (image or text) into numbers, split it into training and test sets, .fit on the training set, then .predict on the test set: the same pattern as in Unit 1.
Congratulate them: they've built and honestly evaluated a real AI classifier — vision or language — from scratch.
Keep for your portfolio: save the Colab notebook and a screenshot of the Results number and one honest failure — it's proof of a complete, honest project.
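
If a student doubts the first answer in the key above, the two numbers can be printed side by side. This is a minimal sketch using the vision starter's setup; train_acc and test_acc are illustrative names. The training number is typically the higher of the two, which is exactly why it flatters the model.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Same setup as the vision starter
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=1)

model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)

# Accuracy on the examples the model studied during training
train_acc = accuracy_score(y_train, model.predict(X_train))
# Accuracy on the held-out examples it never saw
test_acc = accuracy_score(y_test, model.predict(X_test))

print("Training accuracy (flattering):", round(train_acc, 3))
print("Test accuracy (honest):        ", round(test_acc, 3))
```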

Teaching notes

Protect the build time: the teaching slot is deliberately short. If discussion runs long, cut it — students learn most by building and presenting today.
Common blockers to pre-empt: measuring accuracy on training data; forgetting reshape(8, 8) to show an image; using fit_transform on test/new text; too few or unbalanced text examples so accuracy is meaningless. A model that's "100% accurate" on 8 sentences is a teaching moment, not a success.
Two honest project sizes are fine: a student who only reuses the digits or sample-sentence starter but nails the five-point honest write-up has met the goal. Customisation is a bonus, not a requirement.
Fast finishers (extension) — the fairness/limits paragraph: have them write a short "limitations" paragraph like a real project report: name one group or input type the model saw little data about, one situation it would fail in, and one concrete way they'd fix it (more/balanced data, more categories, a model that reads word order). This mirrors the honest project report from Unit 1 and is exactly what the Projects & Assessment section rewards.
Low-tech fallback: if a student can't run Colab, they can present a Teachable Machine model (webcam, no code) or walk through a paper version — the five-point honest checklist still applies. Anyone without a device presents on your shared screen.
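
To back up the note above that "100% accurate" on 8 sentences is a teaching moment, the arithmetic can be shown directly. The sketch assumes the text starter's settings (8 examples, test_size=0.25) and that a fractional test share is rounded up to a whole number of examples; possible_scores is an illustrative helper, not part of the template.

```python
import math


def possible_scores(n_examples, test_size):
    """Return the test-set size and every accuracy value it can produce."""
    n_test = math.ceil(n_examples * test_size)
    scores = [correct / n_test for correct in range(n_test + 1)]
    return n_test, scores


# The starter's 8 sentences leave only 2 for testing
n_test_small, scores_small = possible_scores(8, 0.25)
print(n_test_small, "test examples ->", scores_small)

# 40 sentences give a finer-grained (still small) test set
n_test_larger, scores_larger = possible_scores(40, 0.25)
print(n_test_larger, "test examples ->", len(scores_larger), "possible values")
```

With two test sentences the score can only be 0%, 50% or 100%, so a single lucky guess moves it by half.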

Vocabulary

Term	Meaning
Mini-project	A small, complete build you can demo and explain
Evaluation	Measuring honestly how well a model does
Limitation	A situation where the model fails or is unreliable
Balanced data	Roughly equal examples per category
Demo	Showing your model working live

Resources

Google Colab — where you build and demo (free).
Google — Teachable Machine — a no-code image classifier for the vision track.
scikit-learn — user guide — reference for .fit, .predict, and metrics.
Kaggle — free datasets — real image and text data for a bigger version later.

Practice set

Planning and honesty tasks to sharpen the project — do these while building or as a write-up. Answers are for you, after the arrow.

1. Frame it: in one sentence each, state your project's Question and its categories. → e.g. "Is a movie review positive or negative?" — categories: positive, negative.

2. Balance check: you have 20 positive and 4 negative examples. What's the risk, and the fix? → The data is unbalanced, so the model leans positive and may look accurate while failing on negatives; add more negative examples.

3. Spot the cheat: a classmate reports 100% accuracy measured on the same data they trained on. Is it trustworthy? → No — that's the training set; report accuracy on the held-out test set.

4. Fix the bug (text): why is this line wrong for the test set?

X_test_v = vectorizer.fit_transform(X_test)   # wrong

→ fit_transform re-learns the vocabulary from the test text; use vectorizer.transform(X_test) so the columns match the trained model.

5. Write your Results line: you have preds and y_test. Write the line that prints your accuracy. → from sklearn.metrics import accuracy_score then print(accuracy_score(y_test, preds)).

6. Name a limit (harder): give one specific input your model would get wrong and explain why. → e.g. a sarcastic review ("great, another delay") — bag of words counts "great" and misses the tone; or a messy 4 that looks like a 9.

7. Design question (hardest): you want to grow this into a trustworthy tool. Name two concrete improvements. → e.g. collect more balanced data; add more categories; keep a human in charge of important decisions; use a model that reads word order or richer image features; report per-category accuracy, not just overall.
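
Tasks 6 and 7 both lean on the idea that a bag-of-words model cannot read word order. A small sketch that makes this visible, using two made-up reviews with opposite meanings but exactly the same words:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Two invented reviews: same words, opposite meaning
reviews = ["the film was good not bad",
           "the film was bad not good"]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(reviews).toarray()

print(sorted(vectorizer.vocabulary_))   # the words the model can see
print(counts[0])                        # word counts for the first review
print(counts[1])                        # word counts for the second review
print("Identical to the model:", (counts[0] == counts[1]).all())
```

Because both rows of counts are the same, any model trained on them has to give both reviews the same label, so it is guaranteed to be wrong about one of them.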

Going deeper (optional)

For a class that finishes early, have them combine both worlds or measure a model per category to make their honesty concrete. Print the score for each category so a weak spot can't hide behind a good overall number:

from sklearn.metrics import classification_report

# for the text project (model, X_test_v and y_test already exist)
print(classification_report(y_test, model.predict(X_test_v)))

Have them read the report and answer: which category does the model handle worst, and is the training data for that category smaller or messier? Land the closing lesson of the whole unit — a single accuracy number can flatter a model, but breaking the score down by category (and looking at real failures) is what an honest builder does. Challenge them to write the one improvement they'd make first, backed by what the report shows.
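
A hedged illustration of how a single number can flatter a model. The labels and predictions below are invented for the example (they do not come from any student project), and output_dict=True is used only to pull individual values out of the report.

```python
from sklearn.metrics import accuracy_score, classification_report

# Made-up results: 8 positive and 2 negative test examples
y_test = ["positive"] * 8 + ["negative"] * 2
# The model gets every positive right but misreads one of the two negatives
preds = ["positive"] * 9 + ["negative"]

print("Overall accuracy:", accuracy_score(y_test, preds))

# Recall = the share of real examples of a category that the model found
report = classification_report(y_test, preds, output_dict=True)
for category in ["positive", "negative"]:
    recall = report[category]["recall"]
    print(f"{category}: found {recall:.0%} of the real {category} examples")
```

The overall score looks strong, while the per-category lines show the model found only half of the negative examples.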

Common mistakes & fixes

Mistake: reporting accuracy measured on the training data. → Fix: report the test-set number, the only honest measure of performance on new data.
Mistake: skipping or hiding the limitations. → Fix: every project must name one real failure and why — that's what makes it trustworthy, not weaker.
Mistake: unbalanced or tiny data giving a meaningless score. → Fix: aim for enough balanced examples per category; a great score on 8 items proves little.
Mistake (text): using fit_transform on new or test text. → Fix: fit_transform once on training data, then transform everywhere else.
Mistake (vision): trying to show a flattened row as an image. → Fix: reshape(8, 8) before imshow — 64 numbers must fold back into a grid.
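
For the unbalanced-data mistake above, a quick worked check using the practice set's numbers (20 positive, 4 negative). It compares the data against a "lazy" model that ignores its input and always answers with the most common label; lazy_accuracy and minority_correct are illustrative names.

```python
from collections import Counter

# Label counts from the practice set's balance check
labels = ["positive"] * 20 + ["negative"] * 4

counts = Counter(labels)
print(counts)

# A lazy model that always answers with the most common label
majority_label, majority_count = counts.most_common(1)[0]
lazy_accuracy = majority_count / len(labels)
print(f"Always guessing '{majority_label}' scores {lazy_accuracy:.0%}")

# How many of the smaller category does the lazy model get right?
minority_correct = sum(
    1 for label in labels if label == "negative" and label == majority_label)
print(f"Correct on negatives: {minority_correct} out of {counts['negative']}")
```

Any real model trained on this data should be compared against that lazy score before its accuracy is celebrated.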

Next session

This is the final learning session of Course 2 — students move to the Projects & Assessment section, where they pull everything together into a capstone project and earn their certificate.