Ibnovate Course 3 · The Future Builders
⏱ 75 min · Live session

Session 3 — Build a Neural Network

Duration: 75 min · Format: live online

What you'll learn: by the end, you can build a real neural network with Keras/TensorFlow, train it on a real dataset of images, and measure how well it does — all in a handful of lines. You'll meet Sequential, Dense, compile, fit and evaluate, the five commands you'll use in every project from here on.

Soft skill focus — Resilience

Today you'll also grow Resilience. Your first training run will probably throw a shape error, or hit low accuracy, or warn about something. That's not failure — that's normal. Everyone who builds models reads red error text and keeps going. Resilience is the muscle that turns a broken run into a working one.

What you'll need

Hook

Last session you trained one weight by hand, and it took a dozen lines. A real network has thousands of weights across several layers. Writing all those gradient updates yourself would take pages.

Here's the good news: you never have to. Keras — the friendly front end of Google's TensorFlow — runs the entire training loop for you. You describe the shape of the network, hand it the data, and it does the predicting, the loss, and the weight updates automatically. In the next 75 minutes you'll build a network that looks at a handwritten digit and tells you which number it is — and it'll get most of them right.

Teach — The five commands

Almost every Keras model is built from the same five steps. Learn these once and you can build anything.

  1. Sequential — a stack of layers, one after another. You list the layers and Keras wires them together.
  2. Dense — a fully-connected layer: every neuron connects to every input, exactly like Session 1. You say how many neurons and which activation.
  3. compile — tell the model how to learn: which loss to measure and which optimizer (the gradient-descent engine from Session 2, usually "adam") to use.
  4. fit — run the training loop for a number of epochs. This is where the learning actually happens.
  5. evaluate — test the trained model on data it has never seen, to get an honest accuracy.

That's the whole workflow. The training loop you built by hand in Session 2 is hiding inside fit — same loss, same gradient descent, same repeat, just done for thousands of weights at once.
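To make that concrete, here is a minimal sketch of the kind of loop fit runs for you, shrunk down to a single weight. The toy data (the rule y = 3 × x), the learning rate, and the number of epochs are all assumptions chosen for illustration; they are not taken from Session 2.

```python
# A hand-written training loop for ONE weight, to show what fit automates.
# Assumption: toy data where the true rule is y = 3 * x (made up for this sketch).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = 0.0               # start from a bad guess
learning_rate = 0.01  # hypothetical step size

for epoch in range(200):
    for x, y in zip(xs, ys):
        prediction = w * x               # 1. predict
        error = prediction - y           # 2. how wrong? (loss = error ** 2)
        gradient = 2 * error * x         # 3. slope of the loss with respect to w
        w -= learning_rate * gradient    # 4. adjust the weight, then repeat

print("learned weight:", round(w, 3))    # close to 3.0
```

Keras does the same predict, measure, adjust, repeat cycle, only for every weight in every layer at once.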

The training loop: data in, prediction, loss, adjust the weights, repeat

Teach — The data: MNIST digits

You'll train on MNIST: 70,000 small grayscale images of handwritten digits 0–9, the classic first dataset in deep learning. Each image is 28×28 pixels.

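Two things you always do to image data before training:

  1. Normalize: scale the pixel values from 0–255 down to 0–1.
  2. Flatten: turn each 28×28 grid into a single row of 784 numbers, so a Dense layer can read it.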

The Keras MNIST loader returns the data already split into a training set of 60,000 images (to learn from) and a test set of 10,000 images (to be judged on). Never test on the data you trained on, or you're just checking the model's memory.
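The two preparation steps this session applies to the images, scaling pixels to 0–1 and flattening each 28×28 grid into 784 numbers, can be tried on stand-in data before you touch MNIST. This is only a sketch: the random fake_images array is an assumption that replaces the real dataset, and in the real notebook the Flatten layer does the reshaping for you.

```python
import numpy as np

# Assumption: random stand-in images with the same shape as MNIST digits,
# so this runs without downloading anything.
rng = np.random.default_rng(seed=0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Normalize: pixel values 0-255 become floats in the range 0-1
scaled = fake_images / 255.0

# Flatten: each 28x28 grid becomes one row of 784 numbers
flat = scaled.reshape(len(scaled), -1)

print(fake_images.shape, "->", flat.shape)   # (5, 28, 28) -> (5, 784)
print("min:", scaled.min(), "max:", scaled.max())
```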

Activity — Build and train a real network

Open a fresh Colab notebook. You'll build this in three short cells so you can see each part work before moving on.

Cell 1 — load and prepare the data. Type and run this:

import tensorflow as tf

# Load MNIST: 60,000 training images, 10,000 test images
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize pixels from 0-255 down to 0-1
x_train = x_train / 255.0
x_test  = x_test  / 255.0

print("training images:", x_train.shape)   # (60000, 28, 28)
print("one label:", y_train[0])            # a digit 0-9

Cell 2 — build and compile the network. Type and run this:

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),   # 28x28 image -> 784 numbers
    tf.keras.layers.Dense(128, activation="relu"),   # a hidden layer of 128 neurons
    tf.keras.layers.Dense(10, activation="softmax"), # 10 outputs: one score per digit
])

model.compile(
    optimizer="adam",                          # the gradient-descent engine
    loss="sparse_categorical_crossentropy",    # loss for whole-number class labels
    metrics=["accuracy"],                      # report accuracy as it trains
)

model.summary()   # prints the layers and how many weights it will learn

Cell 3 — train it, then judge it. Type and run this:

# fit = run the training loop for 5 epochs
model.fit(x_train, y_train, epochs=5)

# evaluate = honest score on images it has never seen
test_loss, test_acc = model.evaluate(x_test, y_test)
print("test accuracy:", round(test_acc, 4))

Now read your results:

  1. As fit runs, does the accuracy climb each epoch (e.g. 0.92 → 0.95 → 0.97)? That's the training loop working — loss down, accuracy up.
  2. What was your final test accuracy? Anything around 0.97 (97%) means your network correctly reads about 97 out of 100 unseen digits.
  3. Look at model.summary(). The bottom line shows total params, which is how many weights and biases your loop just tuned: 101,770 for this network. Imagine setting those by hand!
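If you want to check the total params figure yourself, the count can be worked out by hand. The sketch below assumes the network from Cell 2 (784 inputs, 128 hidden neurons, 10 outputs); dense_params is a helper name made up for this example, not a Keras function.

```python
# Count the weights and biases in each Dense layer of the Cell 2 network.
# A Dense layer has (inputs x neurons) weights plus one bias per neuron.
def dense_params(n_inputs, n_neurons):
    return n_inputs * n_neurons + n_neurons

flatten_outputs = 28 * 28                     # 784 numbers per image
hidden = dense_params(flatten_outputs, 128)   # 784*128 + 128
output = dense_params(128, 10)                # 128*10 + 10

print("hidden layer:", hidden)            # 100480
print("output layer:", output)            # 1290
print("total params:", hidden + output)   # 101770
```

The Flatten layer adds nothing to the total because it only rearranges numbers; it has no weights to learn.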

You just built and trained a real deep-learning model. Everything from here is variations on these three cells.

Check yourself

  1. What do compile, fit and evaluate each do? → compile sets the loss and optimizer; fit runs the training loop; evaluate scores the model on unseen test data.
  2. Why divide the pixels by 255? → To normalize them to 0–1 — small, well-scaled inputs make training faster and more stable.
  3. Why is test accuracy more honest than training accuracy? → Because the test set is data the model has never seen, so it measures real learning, not memorising.

Wrap-up

You built a neural network with Sequential and Dense layers, told it how to learn with compile, trained it with fit, and judged it honestly with evaluate — and it reads handwritten digits at around 97% accuracy. Those five commands are the backbone of every model in this course.

Tips & extra challenges

Vocabulary

Term          Meaning
Keras         The friendly high-level API for building networks on TensorFlow
Sequential    A model that stacks layers one after another
Dense layer   A fully-connected layer: every neuron sees every input
Optimizer     The gradient-descent engine that updates weights (e.g. Adam)
Test set      Held-out data, unseen in training, used for an honest score

Resources

Practice set

Practise on your own — work these easy → hard. Answers follow each arrow.

1. Name the step. Which command actually runs the training loop and updates the weights? → fit.

2. Read the shape. x_train.shape prints (60000, 28, 28). How many images are there, and how big is each? → 60,000 images, each 28 × 28 pixels.

3. Why softmax? The last layer is Dense(10, activation="softmax"). Why 10, and why softmax? → 10 because there are 10 digit classes (0–9); softmax turns the scores into probabilities that add up to 1.
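To see softmax at work, here is a small plain-Python sketch. The ten raw scores are made up for illustration, and this softmax function is a simplified stand-in, not the Keras implementation.

```python
import math

def softmax(scores):
    # Subtract the max first so exp() never overflows; the result is unchanged
    biggest = max(scores)
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Assumption: ten made-up raw scores, one per digit 0-9
raw_scores = [0.1, 0.3, 0.2, 2.5, 0.0, 0.4, 0.1, 4.0, 0.6, 0.2]
probs = softmax(raw_scores)

# The predicted digit is the position of the largest probability
predicted_digit = probs.index(max(probs))
print("sum of probabilities:", round(sum(probs), 6))   # 1.0
print("predicted digit:", predicted_digit)             # 7
```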

4. Fix the prep. A classmate's model trains terribly and they forgot to normalize. Write the one line that fixes it. → x_train = x_train / 255.0 (and the same for x_test) — scaling pixels to 0–1.

5. Change the network (harder). In code, add a hidden layer of 64 ReLU neurons between the existing 128-neuron layer and the output. Write the new Sequential list. →

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

Going deeper (optional)

Optional — for when you want to know what the loss and optimizer names mean.

What is sparse_categorical_crossentropy, and why Adam? Session 2 used squared error, which is perfect for predicting a number. But here the model predicts which class out of ten — so we need a loss built for probabilities. Cross-entropy measures how far the predicted probability for the correct digit is from 1: confident-and-right gives tiny loss, confident-and-wrong gives huge loss. The "sparse" part just means your labels are plain integers (7) rather than one-hot vectors. As for Adam — it's gradient descent from Session 2 with two upgrades: it gives each weight its own learning rate and it keeps a little momentum so it rolls through flat spots. It's the default optimizer for good reason; you'll reach for it in nearly every project.
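The "confident-and-right versus confident-and-wrong" idea is easy to check with numbers. The sketch below is a simplified single-image version of the loss, written in plain Python; the two prediction lists are made up, and the real Keras loss works on whole batches at once.

```python
import math

def sparse_crossentropy(probs, label):
    # Loss is -log of the probability the model gave to the correct class
    return -math.log(probs[label])

true_label = 7   # a plain integer label, which is what "sparse" refers to

# Assumption: two made-up 10-class predictions (each sums to 1)
confident_right = [0.01] * 10
confident_right[7] = 0.91    # 91% sure it's a 7, and it is
confident_wrong = [0.01] * 10
confident_wrong[3] = 0.91    # 91% sure it's a 3, but it's a 7

print(round(sparse_crossentropy(confident_right, true_label), 3))  # 0.094
print(round(sparse_crossentropy(confident_wrong, true_label), 3))  # 4.605
```

The wrong-but-confident guess costs roughly fifty times more, which is the push that steers the weights toward the right digit.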

Common mistakes & fixes

What's next

Session 4 — Deep Vision with CNNs: your Dense network flattens an image into a line of numbers and loses all the shapes. Next you'll build a convolutional neural network that keeps the picture as a picture — finding edges, corners and objects the way real computer-vision models do.

Ibnovate · Build · Innovate