Ibnovate Course 3 · The Future Builders
⏱ 75 min · Live session

Session 13 — An End-to-End ML Project

Duration: 75 min · Format: live online

What you'll learn: by the end, you can run a complete machine-learning project from start to finish on a real dataset — get the data, clean and explore it, train a model, evaluate it honestly, and decide whether it's ready to deploy — using pandas and scikit-learn in Colab.

Soft skill focus — Problem-solving

Today you'll also build your problem-solving skills. A real project never arrives clean: columns are missing, the accuracy is disappointing, the code throws an error you've never seen. The skill isn't knowing every answer in advance; it's breaking a messy, scary problem into small steps you can solve, one at a time.

What you'll need

Hook

Everything you've built so far in this course was one piece of the puzzle: a neuron, a CNN, a transformer, an evaluation. But a real machine-learning project is a pipeline — a chain where the messy early stages quietly decide whether the fancy model at the end has any chance.

Here's the secret professionals know: the model is often the easy part. Getting good data, cleaning it, and understanding it is where most of the real work — and most of the winning or losing — happens. Today you run the whole chain yourself, end to end, on a real dataset.

[Figure: an end-to-end machine-learning pipeline: get data, clean and explore, train, evaluate, deploy, then loop]

Teach — The five stages of a real project

Every serious ML project moves through the same stages. Learn the shape once and you can apply it to any dataset forever.

  1. Get data — load it and look at its actual shape: how many rows, how many columns, what does each column mean?
  2. Clean & explore — handle missing values, fix types, and explore (EDA — exploratory data analysis): which features seem to matter? This stage usually takes the most time.
  3. Train — split into train/test, pick a model, and fit it on the training half only.
  4. Evaluate — measure honestly on the test half the model never saw. Is it actually good, or just lucky?
  5. Deploy (and loop) — if it's good enough, ship it so people can use it (that's next session) — then loop back with what you learned.

⚠ Watch out: the single biggest mistake in a whole pipeline is letting the test set leak into training — cleaning or fitting using the test data, then evaluating on it. Your numbers will look amazing and mean nothing. Split first, then only ever look at the test set to score. Treat it like a sealed exam.
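The sealed-exam rule can be sketched on a toy table. The ages and labels below are invented for illustration, and `age_median` is a hypothetical variable name. The point is only the order of operations: split first, learn the fill value from the training rows, then reuse that same value on the test rows.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy data, invented for illustration only (not the Titanic dataset).
toy = pd.DataFrame({
    "Age": [22.0, None, 35.0, 58.0, None, 41.0, 19.0, 30.0, None, 64.0],
    "Survived": [0, 1, 1, 0, 0, 1, 0, 1, 1, 0],
})

# 1) Split FIRST, before computing any statistic.
train, test = train_test_split(toy, test_size=0.2, random_state=0)

# 2) Learn the fill value from the training rows only.
age_median = train["Age"].median()

# 3) Apply that same value to both halves.
train = train.fillna({"Age": age_median})
test = test.fillna({"Age": age_median})

print("median learned from train:", age_median)
print("gaps left in train / test:",
      train["Age"].isna().sum(), test["Age"].isna().sum())
```

If the median were computed on `toy` before the split, the test rows would have helped choose the fill value, which is the leak this warning describes.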

Teach — Cleaning is the real job

When people imagine ML, they picture the training line. In reality, you'll spend most of a project on cleaning.

Do this stage well and a simple model shines. Skip it and no amount of deep learning saves you.
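As a rough sketch of what a first cleaning pass might look like, the snippet below runs three common checks on a tiny invented table: gaps per column, a column stored with the wrong type, and exact duplicate rows. The values are made up for illustration, and the column names only echo the Titanic data used later in this session.

```python
import pandas as pd

# Toy table, invented for illustration; not real passenger records.
raw = pd.DataFrame({
    "Age": ["22", "38", None, "35", "22"],
    "Fare": [7.25, 71.28, 8.05, 53.10, 7.25],
    "Sex": ["male", "female", "male", "female", "male"],
})

# How many gaps per column, and what share of each column is empty?
missing = raw.isnull().sum()
missing_share = raw.isnull().mean()
print(missing)
print(missing_share)

# Fix types: Age arrived as text, so convert it to numbers.
# errors="coerce" turns anything unparseable into NaN instead of raising.
raw["Age"] = pd.to_numeric(raw["Age"], errors="coerce")
print(raw.dtypes)

# Count exact duplicate rows before deciding whether they are errors.
n_duplicates = raw.duplicated().sum()
print("duplicate rows:", n_duplicates)
```

None of these checks changes the data on its own, apart from the type fix. They tell you what needs a decision, and a duplicate row may be a copy-paste error or two genuinely similar records.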

Activity — Run the whole pipeline

You'll use the classic Titanic dataset (who survived the shipwreck) — small, real, and perfect for one session. Open a new Colab notebook and go stage by stage.

Stage 1 — Get the data. Load it straight from a URL and look at it:

import pandas as pd

url = "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
df = pd.read_csv(url)

print("shape (rows, cols):", df.shape)
df.head()

Stage 2 — Explore, then clean. First look, then decide:

print(df.isnull().sum())          # how many gaps per column?
print(df.groupby("Sex")["Survived"].mean())   # explore: does Sex matter?

# clean: keep only the columns we'll use; drop rows missing anything except Age
cols = ["Survived", "Pclass", "Sex", "Age", "Fare"]
df = df[cols].dropna(subset=["Survived", "Pclass", "Sex", "Fare"])

# turn the Sex category into numbers (male=0, female=1)
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})
df.head()

Stage 3 — Split, then train (split before you fit — no leakage). Age still has gaps; fill them after the split, using the median of the training rows only:

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X = df.drop("Survived", axis=1)   # features
y = df["Survived"]                # the answer we want to predict

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# fill missing Age with the TRAIN median, reused unchanged on the test half
age_median = X_train["Age"].median()
X_train = X_train.fillna({"Age": age_median})
X_test = X_test.fillna({"Age": age_median})

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)       # fit on TRAIN only

Stage 4 — Evaluate honestly on the untouched test set:

from sklearn.metrics import accuracy_score, confusion_matrix

preds = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, preds))
print(confusion_matrix(y_test, preds))
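Reading the confusion matrix takes a little practice. In scikit-learn, rows are the true labels and columns are the predicted labels, so for 0/1 labels the four cells can be unpacked as shown in this small sketch. The labels here are invented for illustration.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy labels, invented for illustration (1 = survived, 0 = did not).
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)

# For binary 0/1 labels, ravel() walks the matrix row by row:
# true negatives, false positives, false negatives, true positives.
tn, fp, fn, tp = cm.ravel()
print("correctly predicted 'did not survive':", tn)
print("wrongly predicted 'survived':         ", fp)
print("wrongly predicted 'did not survive':  ", fn)
print("correctly predicted 'survived':       ", tp)

# Accuracy is the correct cells divided by all cells.
acc = accuracy_score(y_true, y_pred)
print("accuracy:", acc)
```

Two models with the same accuracy can make very different kinds of mistakes, which is why the matrix is worth printing alongside the single number.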

Now investigate:

  1. What accuracy did you get? (Around 0.80 is normal here.) Is that good, given that always guessing "did not survive" scores about 0.62 across the full dataset?
  2. Ask the model what it learned: run dict(zip(X.columns, model.feature_importances_)). Which feature mattered most? Does it match what your groupby in Stage 2 hinted at?
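One way to measure the "always guess the most common class" baseline is scikit-learn's `DummyClassifier`. The sketch below uses invented labels and placeholder features (the `_toy` names are hypothetical) so it runs on its own.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

# Toy labels, invented for illustration: 5 of 8 training rows are class 0.
y_train_toy = np.array([0, 0, 0, 0, 0, 1, 1, 1])
y_test_toy = np.array([0, 1, 0, 0, 1])

# This strategy ignores the features, so zeros work as a placeholder.
X_train_toy = np.zeros((len(y_train_toy), 1))
X_test_toy = np.zeros((len(y_test_toy), 1))

# "most_frequent" always predicts the commonest class seen in training.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_train_toy, y_train_toy)

baseline_acc = accuracy_score(y_test_toy, baseline.predict(X_test_toy))
print("baseline accuracy:", baseline_acc)
```

In the notebook, fitting the same object on `X_train, y_train` and scoring it on `X_test, y_test` gives the number to beat on your own split. That number may differ somewhat from the whole-dataset figure, because a 20% test sample will not have exactly the same class mix.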

You just ran a full, real ML pipeline — the exact shape of every project a data scientist ships.

Check yourself

  1. Why do you split before cleaning-with-statistics or training? → To prevent data leakage — if the test set influences training, your score is fake. The test set must stay unseen until you score.
  2. Which stage usually takes the most time? → Clean & explore — real data is messy, and understanding it well is what makes the model work.
  3. Why compare your accuracy to "always guess the most common class"? → That's the baseline. If your model can't beat it, it hasn't learned anything useful yet.

Wrap-up

You now have the whole map: get → clean & explore → train → evaluate → deploy → loop. The model is one box in that chain, and often the smallest. Master the messy early boxes and you can tackle any dataset — which is exactly what a portfolio project needs.

Tips & extra challenges

Vocabulary

Term | Meaning
Pipeline | The full chain of stages from raw data to a deployed model
EDA | Exploratory data analysis: looking at and charting data before modelling
Data leakage | When test/future information sneaks into training, faking good scores
Baseline | The score of a trivial guess; your model must beat it to be useful
Feature importance | How much each input column contributed to the model's decisions

Resources

Practice set

Practise on your own — work these easy → hard. Answers follow each arrow.

1. Name the stage. You're filling missing ages and turning "Sex" into numbers. Which pipeline stage is this? → Clean & explore (the cleaning part).

2. Spot the leak. A friend cleans the whole dataset using the overall median, then splits into train/test. Is that leakage? → Yes — the median was computed using test rows too, so test info leaked into training. Split first, compute the median on train only.

3. Read the baseline. 62% of passengers did not survive. Your model scores 60%. Good or bad? → Bad — it's below the "always guess did-not-survive" baseline of 62%, so it's worse than a trivial guess.

4. Choose a fix. A column is 90% empty. Fill it or drop it? → Usually drop the column — filling 90% of it invents far more data than it keeps; too little real signal remains.

5. Write the split (harder). Write the line that splits X and y into 80% train / 20% test with a fixed seed. → X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42).

6. Interpret importance (harder). Your model reports Sex: 0.45, Fare: 0.20, Age: 0.20, Pclass: 0.15. In one sentence, what's the takeaway? → Sex was by far the strongest predictor of survival; fare, age and class each mattered less and roughly equally.
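For question 6, a sorted pandas Series is often easier to read than a raw dictionary. This sketch uses the four importances quoted in the question; `ranked` is just an illustrative variable name.

```python
import pandas as pd

# The importances quoted in practice question 6.
importances = pd.Series({"Sex": 0.45, "Fare": 0.20, "Age": 0.20, "Pclass": 0.15})

# Largest first, so the strongest feature is on top.
ranked = importances.sort_values(ascending=False)
print(ranked)

# Random-forest importances in scikit-learn are normalised, so they should sum to about 1.
total = ranked.sum()
print("total:", total)
```

In the session notebook, the same view could be built with `pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)`, assuming `model` and `X` are the objects from Stage 3.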

Going deeper (optional)

Optional — for when you wonder how professionals keep a pipeline reliable.

Why wrap the steps in a Pipeline object? Doing cleaning and modelling in separate cells works for learning, but it's easy to accidentally leak — for example, scaling using the whole dataset. scikit-learn's Pipeline bundles your preprocessing and your model into one object that is .fit() on train only and applied identically to test, making leakage much harder. It also means the exact same steps run in deployment, so what you tested is what ships. As your projects grow, moving from loose cells to a single Pipeline is a mark of real maturity — explore sklearn.pipeline.Pipeline when you're ready.
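A minimal sketch of that idea is below, assuming a median imputer followed by a random forest. The data is randomly generated for illustration, and the step names `"impute"` and `"model"` are arbitrary labels, not anything the course prescribes.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Toy data, randomly generated for illustration (not the Titanic dataset).
rng = np.random.default_rng(0)
n = 40
X = pd.DataFrame({
    "Age": rng.integers(1, 80, size=n).astype(float),
    "Fare": rng.uniform(5, 100, size=n),
})
X.loc[::7, "Age"] = np.nan            # punch some gaps into Age
y = (X["Fare"] > 50).astype(int)      # a made-up target to predict

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Preprocessing and model bundled into one object.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("model", RandomForestClassifier(n_estimators=50, random_state=42)),
])

# fit() learns the medians AND trains the forest, using train rows only.
pipe.fit(X_train, y_train)

# predict() and score() reuse the medians learned from train on the test rows.
preds = pipe.predict(X_test)
test_acc = pipe.score(X_test, y_test)
print("test accuracy:", test_acc)
```

Because the imputer lives inside the pipeline, there is no separate cell where a statistic could be computed on the whole dataset by accident.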

Common mistakes & fixes

What's next

Session 14 — Deploy Your AI: you have a trained, evaluated model sitting in a notebook where only you can use it. Next you'll wrap it in a simple app with Gradio and publish it on Hugging Face Spaces — so anyone in the world can try your model from a single link.

Ibnovate · Build · Innovate