Ibnovate Course 2 · The Rising Builders

Session 3 — Your First Prediction

Duration: 75 min · Format: live online · Ages: 12–15

Session goal: by the end, students can explain how a model learns a pattern and why we split data into training and test sets, and can build and run a real prediction with scikit-learn.

Before class — prep (5 min)

Agenda

Time Segment
0:00 Hook — you already spot patterns (5 min)
0:05 Teach — a model learns the pattern, then predicts (13 min)
0:18 Teach — train, then test (no cheating) (13 min)
0:31 Activity — build a real predictor in Colab (27 min)
0:58 Check for understanding (10 min)
1:08 Wrap-up + homework (7 min)

0:00 · Hook (5 min)

Ask the class and take a few answers (chat or unmute):

Land the idea: their brain already sees the pattern. A model is just a program that finds that pattern in data and uses it to predict new cases. Tell them that today they'll make a computer predict — not magic, just patterns and math.


0:05 · Teach — A model learns the pattern, then predicts (13 min)

Explain: give a model some data points, and it finds the line — the pattern — that fits them. Then it can predict a new value by reading off that line.
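For your own reference, here is a minimal sketch of what "finding the line" can mean, using the standard least-squares formulas in plain Python. It assumes the same hours and scores used in the Colab activity later in this session. The names fit_line, slope and intercept are illustrative only, and scikit-learn's own implementation is more general than this.

```python
# Same data as the Colab activity: hours studied -> test score
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 79, 90]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(hours, scores)

# "Reading off the line": plug a new input into the rule
predicted = slope * 6 + intercept
print(f"score = {slope:.1f} * hours + {intercept:.1f}")
print(f"Predicted score for 6 hours: {predicted:.1f}")
```

On this data the rule works out to roughly score = 9.5 × hours + 41.9, so 6 hours gives about 98.9. Only two numbers (slope and intercept) are kept, not the five dots.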

Share this diagram and name the three parts:

A scatter of points, a model line through them, and a predicted new point

Ask: "If all the dots sat perfectly on one straight line, how confident would the prediction be? What if they were scattered everywhere?" (Answer: a tight line means a strong, reliable pattern; a scattered cloud means a weak one.)

⚠ Watch for: students may think the model "memorises" the answers. It doesn't store the dots — it learns a general rule (the line) it can apply to inputs it has never seen.


0:18 · Teach — Train, then test — no cheating (13 min)

Explain: we never test a model on the same data it learned from — that's like handing it the exam answers first. We split the data into two parts.

Share this diagram and walk through the split:

Data split into 80% training and 20% test, feeding a model that scores 92% accuracy

Ask: "Why would testing on the training data give a misleadingly high score?" (Answer: the model has already seen those answers, so it can look perfect without having learned a general pattern.)

⚠ Watch for: students want to judge a model by how well it does on data it trained on. Stress that the test set — data it has never seen — is the only honest measure.
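If you want to show the "exam answers" problem in numbers, here is a small sketch in plain Python. It deliberately swaps in a "memoriser" (a nearest-match lookup) instead of the LinearRegression model used in the activity, because a memoriser makes the cheating effect obvious. The data and the helper names memoriser_predict and average_error are made up for illustration.

```python
import random

# Made-up data for illustration: hours studied -> test score
hours = list(range(1, 11))
scores = [35, 42, 46, 55, 58, 66, 71, 75, 84, 88]

# Shuffle the rows, then keep 80% for training and hide 20% for testing
rows = list(zip(hours, scores))
random.Random(1).shuffle(rows)
split = int(len(rows) * 0.8)
train, test = rows[:split], rows[split:]

def memoriser_predict(h):
    """A 'cheating' model: answer with the score of the closest training row."""
    nearest = min(train, key=lambda row: abs(row[0] - h))
    return nearest[1]

def average_error(data):
    """Average gap between predicted and true scores."""
    return sum(abs(memoriser_predict(h) - s) for h, s in data) / len(data)

train_error = average_error(train)
test_error = average_error(test)
print("Error on training data:", train_error)  # zero: it has seen these answers
print("Error on test data:", test_error)        # the honest score
```

The training error is zero because every training row is its own closest match, while the test rows expose how well the model copes with inputs it has never seen.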


0:31 · Activity — Build a real predictor (27 min)

Have students open Google Colab and build a working predictor. Screen-share and build it line by line with them.

Type/run this together in Colab:

from sklearn.linear_model import LinearRegression

# our data: hours studied  ->  test score
hours  = [[1], [2], [3], [4], [5]]
scores = [52, 60, 71, 79, 90]

model = LinearRegression()
model.fit(hours, scores)          # 1) learn the pattern

prediction = model.predict([[6]]) # 2) predict for 6 hours
print("Predicted score:", prediction[0])

Circulate for the double-brackets confusion — hours is a list of lists ([[1], [2], …]), and predict also needs [[6]], not [6]. This is the error most students hit.
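A short sketch you could run to demonstrate the double-brackets rule, assuming the same data as above. Each inner list is one row (one student), and each row holds one feature (hours). The exact wording of the error message depends on the scikit-learn version installed, so it is printed rather than quoted here.

```python
from sklearn.linear_model import LinearRegression

hours = [[1], [2], [3], [4], [5]]   # 5 rows, 1 feature per row
scores = [52, 60, 71, 79, 90]

model = LinearRegression()
model.fit(hours, scores)

# Correct: a list of rows, even when there is only one row
print(model.predict([[6]]))

# Several predictions at once: one inner list per row
print(model.predict([[6], [7]]))

# Wrong: a flat list is not a table of rows, so scikit-learn rejects it
try:
    model.predict([6])
except ValueError as err:
    error_message = str(err)
    print("Error:", error_message.splitlines()[0])
```

A quick fix students can remember: if the error mentions a 1D array, wrap the value in one more pair of brackets.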

Explain: "You've just trained a machine learning model in a handful of lines. That .fit() step is the learning."


0:58 · Check for understanding (10 min)

Ask these aloud or drop them in the chat. Answer key (for you):

  1. Why do we keep a separate test set? → To check if the model really learned — not just memorised. Testing on the training data would be cheating.
  2. In the code, which line is the "learning" step? → model.fit(hours, scores). The .fit() call finds the pattern in the data.
  3. What does higher accuracy mean? → The model is right more often on data it hasn't seen before.
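For question 3, accuracy is the share of predictions that match the true answer. A minimal sketch with invented pass/fail labels (none of these values come from the session data):

```python
# Made-up example: true results vs. a model's guesses for five students
actual    = ["pass", "pass", "fail", "pass", "fail"]
predicted = ["pass", "fail", "fail", "pass", "fail"]

# Count the positions where the guess matches the truth
correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)
print(f"Accuracy: {accuracy:.0%}")  # 4 of 5 guesses are right
```

For number predictions such as the test scores in the Colab activity, "right or wrong" does not apply neatly, which is why the stretch code in the Teaching notes measures average error instead.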

1:08 · Wrap-up + homework (7 min)


Teaching notes

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# X = features, y = target (use a real dataset with several rows)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# Train each model on the training set, then score it on the unseen test set
for model in [LinearRegression(), DecisionTreeRegressor()]:
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print(type(model).__name__, "error:", mean_absolute_error(y_test, preds))
```

Explain that lower error = better, and ask which model wins on their data. Challenge them to change test_size or random_state and note whether the winner changes (their first real experiment).

- Low-tech fallback: if devices can't run Colab, plot the five (hours, scores) points on a shared grid, draw the best-fit line by hand, and read off the prediction for 6 hours. Then show that scikit-learn does exactly this, only with math.
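The test_size / random_state experiment from the stretch activity could be set up roughly like this. As an assumption, synthetic data from scikit-learn's make_regression stands in for the students' own dataset, and the seeds 1, 2, 3 are arbitrary choices; swap in real X and y when you have them.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: synthetic data stands in for "their data"
X, y = make_regression(n_samples=200, n_features=3, noise=20.0, random_state=0)

winners = []
for seed in [1, 2, 3]:
    # A different random_state shuffles different rows into the test set
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    errors = {}
    for model in [LinearRegression(), DecisionTreeRegressor(random_state=0)]:
        model.fit(X_train, y_train)
        preds = model.predict(X_test)
        errors[type(model).__name__] = mean_absolute_error(y_test, preds)
    # Lower error wins
    winner = min(errors, key=errors.get)
    winners.append(winner)
    print(f"random_state={seed}: winner is {winner}")
```

If the winner stays the same across seeds, students can be more confident the result is not a fluke of one particular split.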

Vocabulary

Term          Meaning
Model         A program that learns a pattern to predict
Train / Fit   Teaching the model with data (.fit)
Test set      Hidden data used to check the model
Feature       An input used to predict (e.g. hours)
Accuracy      How often the model is right

Resources

Next session

Session 4 — What AI Can (and Can't) Do: before going further, the most important lesson of the unit — using this power responsibly and fairly.
