Session 21 — How Machines Read
Duration: 75 min · Format: live online · Ages: 12–15
Session goal: by the end, students can explain how a sentence is split into tokens and counted into numbers, build and test a small sentiment classifier in Python, and name real limits like sarcasm and unseen words.
Before class — prep (5 min)
- Open Google Colab → New notebook, ready to screen-share. You'll build the text classifier live. (scikit-learn is already in Colab — no setup.)
- Reminder for yourself: last session images became rows of numbers; today words become numbers too, then it's the same .fit()/.predict() recipe from Unit 1.
- Optional: if you want the wow-moment at the end, have the Hugging Face pipeline demo (in Going deeper) ready. It needs a one-time pip install, so test it beforehand.
Agenda
| Time | Segment |
|---|---|
| 0:00 | Hook — how does a phone know a review is happy? (5 min) |
| 0:05 | Teach — text becomes tokens, then numbers (14 min) |
| 0:19 | Teach — a bag of words can be classified (13 min) |
| 0:32 | Activity — build a sentiment classifier in Colab (26 min) |
| 0:58 | Check for understanding (10 min) |
| 1:08 | Wrap-up + homework (7 min) |
0:00 · Hook (5 min)
Ask the class and take a few answers (chat or unmute):
- "An app flags a review as happy or angry before a human reads it. How could a computer possibly 'read' the feeling?"
- "A computer only does math on numbers. So how do you turn the sentence 'I love this' into numbers?"
Land it: computers can't read words — but they can count them. Today they'll turn sentences into numbers and train a model to tell happy text from unhappy text — then find exactly where it gets fooled.
0:05 · Teach — Text becomes tokens, then numbers (14 min)
Explain: the first step in every language model is tokenizing — chopping text into pieces called tokens (here, simply words). Then each token becomes a number the computer can count.
Share this diagram so students can follow how text is split into tokens, counted into numbers, and read by a model that predicts the mood:
Type/run this together in Colab:
text = "I really love this movie"
tokens = text.lower().split() # lowercase, then split on spaces
print(tokens) # ['i', 'really', 'love', 'this', 'movie']
print("Number of tokens:", len(tokens))
Explain each move: lower() so Love and love count as the same word, split() to break on spaces. Now show how a computer turns a whole set of sentences into a table of word counts — the "bag of words":
from sklearn.feature_extraction.text import CountVectorizer
texts = ["I love this", "I hate this"]
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(texts)
print("Words it found:", vectorizer.get_feature_names_out())
print(counts.toarray()) # one row per sentence, one column per word
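Walk through the grid: each column is a word, each row is a sentence, each number is how many times that word appeared. The sentence is now just numbers. Point out that CountVectorizer lowercases the text itself and, by default, skips one-letter words, so "I" gets no column and the grid has only three: hate, love, this.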
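If students ask what the vectorizer is doing under the hood, the counting can be sketched by hand in plain Python. This is an illustration only, not how scikit-learn is implemented; the names tokenized, vocab, and table are made up for this sketch, and unlike CountVectorizer's defaults it keeps the one-letter token "i".

```python
from collections import Counter

texts = ["I love this", "I hate this"]

# Step 1: tokenize each sentence (lowercase, then split on whitespace).
tokenized = [text.lower().split() for text in texts]

# Step 2: the vocabulary is every distinct token, sorted so columns keep a fixed order.
vocab = sorted({token for tokens in tokenized for token in tokens})

# Step 3: one row per sentence, one count per vocabulary word.
table = []
for tokens in tokenized:
    counts = Counter(tokens)
    table.append([counts[word] for word in vocab])

print(vocab)        # ['hate', 'i', 'love', 'this']
for row in table:
    print(row)      # [0, 1, 1, 1] then [1, 1, 0, 1]
```

The two rows differ only in the hate and love columns, which is exactly the signal a classifier can pick up.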
Ask: "Why lowercase everything first?" (Answer: so Love, love, and LOVE are treated as the same word instead of three different ones.)
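A quick way to show the answer rather than tell it is to count the same text with and without lower(). The variable names here are hypothetical and only for this demo.

```python
from collections import Counter

shout = "Love love LOVE"

raw_counts = Counter(shout.split())            # case kept: three different tokens
lower_counts = Counter(shout.lower().split())  # case removed: one token seen three times

print(len(raw_counts), "different words without lower()")
print(lower_counts)   # Counter({'love': 3})
```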
⚠ Watch for the #1 misconception: students think the model understands the words. It doesn't — it only counts them. It has no idea what "love" means; it just learns that the count of certain words goes with "positive."
0:19 · Teach — A bag of words can be classified (13 min)
Explain: once every sentence is a row of word-counts, text classification is the same train/test/.fit() recipe from Unit 1 — the features are just word counts instead of pixels. We give the model labelled examples (positive / negative) and it learns which words lean which way.
Type/run this together in Colab:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
texts = ["I love this", "This is great", "Absolutely wonderful", "Best day ever",
"I hate this", "This is terrible", "So boring", "Worst day ever"]
labels = ["positive", "positive", "positive", "positive",
"negative", "negative", "negative", "negative"]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts) # words -> numbers (the features)
model = LogisticRegression()
model.fit(X, labels) # learn which words lean positive/negative
print("Trained on", len(texts), "examples.")
Explain that X is the bag-of-words table and labels is the target — identical shape to every model they've built.
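To make "identical shape" concrete, it can help to print the shapes. The sketch below rebuilds the same eight-sentence model and inspects it; it assumes CountVectorizer's default settings, which skip the one-letter word "I".

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["I love this", "This is great", "Absolutely wonderful", "Best day ever",
         "I hate this", "This is terrible", "So boring", "Worst day ever"]
labels = ["positive"] * 4 + ["negative"] * 4

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)   # rows = sentences, columns = words
model = LogisticRegression()
model.fit(X, labels)

print(X.shape)          # (8, 14): 8 sentences, 14 word columns
print(len(labels))      # 8: one label per row
print(model.classes_)   # the two labels the model learned
```

Eight rows of features and eight labels is the same layout as the image models from Unit 1; only the meaning of the columns has changed.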
Ask: "This model saw only 8 tiny sentences. Do you trust it yet?" (Answer: no — far too little data; a great honesty setup for the activity.)
⚠ Watch for: students assume more clever wording helps the model. What helps is more, varied, labelled examples — the same data lesson from Unit 1, now for text.
0:32 · Activity — Build a sentiment classifier (26 min)
Have students open their own Google Colab → New notebook, build the classifier above, then test it and try to break it. Screen-share and build line by line.
Type/run this together in Colab — predict on brand-new sentences:
new_texts = ["I really love this movie", "This was so boring"]
new_X = vectorizer.transform(new_texts) # SAME vectorizer, don't refit
print(model.predict(new_X))
Point out the crucial detail: use vectorizer.transform (not fit_transform) on new text, so it uses the same word columns it learned. Then turn students loose to stress-test it and report in the chat:
- Try sentences that should be positive/negative and see if it agrees.
- Try to fool it with sarcasm: "Oh great, another rainy day". Ask: "What does it say, and is it right? Why does it fail?"
- Try a word the model never saw, like "This is fantastic" (if fantastic wasn't in training). Ask: "What happens to an unknown word?" (Answer: it's ignored, because the model has no column for it.)
Then measure honestly. Have them see that unknown words simply vanish:
mystery = vectorizer.transform(["This is fantastic and superb"])
print(mystery.toarray()) # mostly zeros: only the "is" and "this" columns show 1; fantastic, and, superb have no column
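Reading a row of zeros and ones is hard for students, so one option is to ask the vectorizer which words actually survived. The sketch below assumes the eight training sentences from earlier; kept and dropped are names invented for this demo.

```python
from sklearn.feature_extraction.text import CountVectorizer

texts = ["I love this", "This is great", "Absolutely wonderful", "Best day ever",
         "I hate this", "This is terrible", "So boring", "Worst day ever"]
vectorizer = CountVectorizer()
vectorizer.fit(texts)

sentence = "This is fantastic and superb"
mystery = vectorizer.transform([sentence])

# inverse_transform lists the known words that have a non-zero count in each row.
kept = {str(word) for word in vectorizer.inverse_transform(mystery)[0]}
dropped = [word for word in sentence.lower().split() if word not in kept]

print("Kept:", sorted(kept))    # ['is', 'this']
print("Dropped:", dropped)      # ['fantastic', 'and', 'superb']
```

The model only ever sees the kept words, so the two words carrying the feeling never reach it.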
Circulate for the classic mistakes: calling fit_transform on new text (which re-learns the vocabulary and breaks alignment) and expecting the model to handle words it never trained on.
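If a student hits the fit_transform mistake, a side-by-side comparison can make the problem visible. This is a sketch under the same eight-sentence setup; wrong_vectorizer is a hypothetical second vectorizer used so the trained one stays untouched.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["I love this", "This is great", "Absolutely wonderful", "Best day ever",
         "I hate this", "This is terrible", "So boring", "Worst day ever"]
labels = ["positive"] * 4 + ["negative"] * 4

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
model = LogisticRegression().fit(X, labels)

new_texts = ["I really love this movie"]

# Right: reuse the trained vocabulary, so the columns match the model.
good_X = vectorizer.transform(new_texts)
print(good_X.shape)   # (1, 14)

# Wrong: learning a fresh vocabulary from the new text gives different columns.
wrong_vectorizer = CountVectorizer()
bad_X = wrong_vectorizer.fit_transform(new_texts)
print(bad_X.shape)    # (1, 4)

mismatch_caught = False
try:
    model.predict(bad_X)
except ValueError as err:
    mismatch_caught = True
    print("Mismatch:", err)
```

The model was trained on 14 columns, so a table with 4 columns cannot be lined up with what it learned.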
0:58 · Check for understanding (10 min)
Ask these aloud or drop them in the chat. Answer key (for you):
- What is a token, and what's the first step to "read" text? → A token is a piece of text (here, a word); the first step is tokenizing — splitting text into tokens.
- How does "I love this" become numbers? → Bag of words — count how many times each known word appears; each count is a feature.
- Name one honest limit of this model. → e.g. sarcasm, unknown words it never saw, tiny/biased training data, or it ignores word order.
1:08 · Wrap-up + homework (7 min)
- Ask one student to explain, in their own words, why the model doesn't actually understand the sentence.
- Homework — Break your classifier: find 3 sentences your model gets wrong. For each, write one line on why — sarcasm? an unknown word? word order? Then write one sentence: what data would you add to fix it? Bring it to Session 22 — next session you pick vision or text and build your own project.
Teaching notes
- Correct this misconception: "the model understands language." It only counts words; it has no meaning, no context, no idea what a word refers to.
- fit_transform vs transform: call fit_transform once on the training texts to learn the vocabulary, then transform on all new text to reuse the same columns. Refitting on new text re-learns the vocabulary, so the columns no longer match what the model was trained on: predict usually stops with a feature-count error, or gives quietly wrong answers if the column counts happen to match. Flag it before they hit it.
- Word order is thrown away: "dog bites man" and "man bites dog" produce the identical bag of words. Mention this as a real limitation and a reason more advanced models (which read order) exist.
- Fast finishers (extension) — measure it, then peek inside: real evaluators split text data and check accuracy too. Have them build a bigger labelled list (12–20 sentences), do a train/test split, and print accuracy — then read which words the model treats as most positive/negative:
import numpy as np
words = vectorizer.get_feature_names_out()
weights = model.coef_[0] # how each word pushes the label: above 0 leans toward model.classes_[1] ("positive" here), below 0 toward "negative"
order = np.argsort(weights)
print("Most negative words:", words[order[:3]])
print("Most positive words:", words[order[-3:]])
Ask whether the learned "positive" and "negative" words make sense, and what a weird one reveals about small, biased data (a word looks positive only because it happened to sit in positive examples). This ties straight back to Unit 1's bias lesson.
- Low-tech fallback: if devices can't run Colab, do bag-of-words on the shared screen. Tally word counts for two happy and two angry sentences by hand, then have students "predict" a new sentence by which word-counts it matches. Reveal that scikit-learn does exactly this counting.
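For the fast-finisher extension in these notes, the train/test split and accuracy step might look like the sketch below. The sixteen sentences are invented for illustration, and the split settings (test_size=0.25, random_state=0, stratify) are one reasonable choice, not the only one. With this little data the accuracy number will swing a lot between splits, which is itself worth discussing.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Made-up example data: 8 positive, then 8 negative sentences.
texts = ["I love this song", "What a great game", "Absolutely wonderful food",
         "Best trip ever", "This is so fun", "I really enjoyed it",
         "Such a happy day", "The show was amazing",
         "I hate this song", "What a terrible game", "Absolutely awful food",
         "Worst trip ever", "This is so boring", "I really disliked it",
         "Such a sad day", "The show was horrible"]
labels = ["positive"] * 8 + ["negative"] * 8

# Hold back a quarter of the sentences; stratify keeps both labels in each part.
train_texts, test_texts, train_labels, test_labels = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels)

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)  # learn vocabulary from training text only
X_test = vectorizer.transform(test_texts)        # reuse the same columns for test text

model = LogisticRegression()
model.fit(X_train, train_labels)

accuracy = accuracy_score(test_labels, model.predict(X_test))
print("Test sentences:", len(test_texts))
print("Accuracy:", accuracy)
```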
Vocabulary
| Term | Meaning |
|---|---|
| Token | A piece of text, usually a word |
| Tokenize | Split text into tokens |
| Bag of words | Counting how often each word appears, ignoring order |
| Sentiment | Whether text is positive or negative |
| Vectorizer | The tool that turns text into number counts |
Resources
- Google Colab: where you build it all (free).
- scikit-learn (text feature extraction): how CountVectorizer works.
- Hugging Face (pipelines): a free, one-line sentiment model (see Going deeper).
- Kaggle (Natural Language Processing): free next-step lessons.
Practice set
A mix of concept questions and short coding tasks on tokens, bag of words, and honest limits — easy to hard. Use for lab time or homework.
1. Define it: what does it mean to tokenize a sentence? → Split it into pieces (tokens) — here, individual words.
2. Predict the output: what does this print? → ['i', 'love', 'pizza'] — lowercased and split on spaces.
print("I Love pizza".lower().split())
3. Reasoning: why do we lower() text before counting words? → So Love, love, and LOVE count as the same word, not three different ones.
4. Read the bag: for the sentences ["good good movie", "bad movie"], the word movie appears in both. In the counts table, what number sits in the movie column for each row? → 1 and 1 — it appears once in each sentence.
5. Fix the bug: why does predicting on new text with fit_transform misbehave? → fit_transform re-learns the vocabulary from the new text, so the columns no longer match the trained model (here predict stops with a feature-count error); use vectorizer.transform(...) instead.
new_X = vectorizer.fit_transform(["I love this"]) # wrong on new text
print(model.predict(new_X))
6. Reasoning (harder): the model gets "Oh great, another Monday" wrong and calls it positive. Why? → It counts the positive word great and can't detect sarcasm — it has no sense of tone or context.
7. Reasoning (hardest): "dog bites man" and "man bites dog" get the exact same bag of words. What limitation does this reveal, and why does it matter? → Bag of words ignores order, so it can't tell who did what — meaning that depends on order is lost.
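For question 7, a two-line check in plain Python can settle any doubt. This sketch uses Counter as a stand-in for the bag of words; bag_a and bag_b are names chosen for the demo.

```python
from collections import Counter

bag_a = Counter("dog bites man".split())
bag_b = Counter("man bites dog".split())

print(bag_a == bag_b)                        # True: same words, same counts
print("dog bites man" == "man bites dog")    # False: the sentences differ
```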
Going deeper (optional)
For a class that's flying, show a modern model that does handle unseen words and some context — a pretrained sentiment model in one line with Hugging Face. It's free but downloads a model the first time, so run it once yourself before class:
!pip install -q transformers
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
print(classifier("I really love this movie"))
print(classifier("Oh great, another rainy day")) # try to fool it too
Contrast it honestly with their own model: this one was trained on millions of examples, so it knows far more words and some tone — but it's still not perfect (test the sarcasm line and see). Land the lesson: bigger training data buys more coverage, but no text model truly understands — they all have limits worth naming. This is exactly the honesty mindset for their Session 22 project.
Common mistakes & fixes
- Mistake: believing the model understands the words. → Fix: it only counts them; it learns which counts go with which label, nothing more.
- Mistake: calling fit_transform on new text. → Fix: fit_transform once on training data, then transform on new text so the word columns stay aligned.
- Mistake: expecting it to handle words it never trained on. → Fix: unknown words have no column, so they're ignored; add them to the training data to teach them.
- Mistake: trusting it on sarcasm or tone. → Fix: bag of words has no sense of tone; sarcasm regularly fools it — name this as a real limit.
- Mistake: thinking word order is captured. → Fix: bag of words ignores order — "dog bites man" equals "man bites dog" to the model.
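The "add them to the training data" fix can be shown directly. The sketch below compares the vocabulary before and after adding one new training sentence; before and after are hypothetical names, and in a real run the new sentence would also need a label and the model would need refitting.

```python
from sklearn.feature_extraction.text import CountVectorizer

texts = ["I love this", "This is great", "Absolutely wonderful", "Best day ever",
         "I hate this", "This is terrible", "So boring", "Worst day ever"]

before = CountVectorizer().fit(texts)
after = CountVectorizer().fit(texts + ["This is fantastic"])

print("fantastic" in before.vocabulary_)   # False: no column, so the word is ignored
print("fantastic" in after.vocabulary_)    # True: the word now has its own column
print(len(before.vocabulary_), "->", len(after.vocabulary_))   # 14 -> 15
```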
Next session
Session 22 — Your AI Mini-Project & Showcase: students pick vision or text, build their own small classifier, evaluate it honestly, and present it — the build project for this unit.