
Introduction to Linear Regression Using TensorFlow

A beginner-friendly guide to building linear regression with TensorFlow and Keras — from the math to a working model that learns y = 2x - 1 in minutes.

August 11, 2021 · 3 min read · By Kshitiz Regmi

Linear regression is one of the foundations of machine learning. Before exploring complex neural architectures, it helps to understand how to model a linear relationship with gradient descent, and TensorFlow's Keras API keeps the code for that short.

The Math

The basic linear regression model:

$$y_i = m \cdot x_i + c$$

Where:

  • $x_i$ = input (independent variable)
  • $y_i$ = output (dependent variable)
  • $m$ = slope (weight)
  • $c$ = intercept (bias)

The goal: find $m$ and $c$ that minimize the error between the predictions $\hat{y}_i = m \cdot x_i + c$ and the true values $y_i$, measured as the mean squared error:

$$\text{MSE} = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2$$
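
As a quick check of the formula, here is a minimal NumPy sketch that evaluates the MSE for two candidate lines. The helper name `mse` and the sample arrays are illustrative choices, not part of any library.

```python
import numpy as np

def mse(m, c, x, y):
    """Mean squared error of the line y_hat = m * x + c on the data (x, y)."""
    y_hat = m * x + c
    return np.mean((y - y_hat) ** 2)

# Illustrative data that follows y = 2x - 1 exactly
x = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

perfect = mse(2.0, -1.0, x, y)   # the true line: zero error
poor = mse(1.0, 0.0, x, y)       # a wrong guess: 19 / 6, about 3.17

print(perfect, poor)
```

The better the candidate line, the smaller the value; training is a search for the pair (m, c) that drives this number toward zero.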

Why TensorFlow for Linear Regression?

A linear regressor is simply a 1-neuron neural network with no activation function. Using TensorFlow:

  • The training loop, backpropagation, and weight updates are handled automatically
  • The exact same code structure scales to deep networks
  • You learn the TensorFlow/Keras API on the simplest possible problem

Dataset

We'll use a tiny dataset with a hidden pattern: $y = 2x - 1$

import numpy as np

X = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
y = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

# Verify the pattern
for xi, yi in zip(X, y):
    print(f"x={xi}, y={yi}, 2x-1={2*xi - 1}")
# x=-1.0, y=-3.0, 2x-1=-3.0 ✓
# x=0.0,  y=-1.0, 2x-1=-1.0 ✓
# ...
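
Because the problem is so small, the least-squares answer can also be computed directly, which gives a reference value to compare a trained model against. A minimal sketch using NumPy's `np.polyfit` with degree 1 (the variable names `slope` and `intercept` are chosen here for illustration):

```python
import numpy as np

X = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
y = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

# A degree-1 fit returns coefficients highest power first: [slope, intercept]
slope, intercept = np.polyfit(X, y, deg=1)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

Gradient descent should approach the same slope and intercept, only iteratively rather than in one step.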

Building the Model

A Dense layer with 1 unit and no activation is mathematically identical to linear regression:

import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(units=1, input_shape=[1])
])

model.compile(
    optimizer='sgd',            # Stochastic Gradient Descent
    loss='mean_squared_error'
)

model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)      Output Shape     Param #
=================================================================
 dense (Dense)     (None, 1)        2
=================================================================
Total params: 2
Trainable params: 2 (1 weight = slope m, 1 bias = intercept c)

Just 2 parameters: the weight (slope $m$) and the bias (intercept $c$).
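
To make those two parameters concrete, here is a rough NumPy sketch of the arithmetic a single-unit Dense layer performs: a matrix product with a (1, 1) kernel plus a length-1 bias. The function name `dense_forward` is hypothetical, and the values 2.0 and -1.0 are set by hand for illustration; a freshly built Keras layer would start from random kernel values instead.

```python
import numpy as np

kernel = np.array([[2.0]])   # shape (input_dim, units) = (1, 1), plays the role of m
bias = np.array([-1.0])      # shape (units,) = (1,), plays the role of c

def dense_forward(x):
    """Forward pass of a 1-unit layer with no activation: x @ kernel + bias."""
    return x @ kernel + bias

batch = np.array([[-1.0], [0.0], [10.0]])   # shape (batch_size, 1)
print(dense_forward(batch).ravel())          # [-3. -1. 19.]
```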

Training

history = model.fit(X, y, epochs=500, verbose=0)
print(f"Final MSE loss: {history.history['loss'][-1]:.8f}")
# Prints a small loss close to zero; the exact value varies from run to run
# because the weights start from random values

During training, SGD iteratively adjusts $m$ and $c$ to minimize MSE:

  1. Forward pass: compute $\hat{y} = m \cdot x + c$
  2. Compute loss: $\text{MSE} = \text{mean}((y - \hat{y})^2)$
  3. Backward pass: compute gradients $\frac{\partial L}{\partial m}$, $\frac{\partial L}{\partial c}$
  4. Update: $m \leftarrow m - \text{lr} \cdot \frac{\partial L}{\partial m}$ and $c \leftarrow c - \text{lr} \cdot \frac{\partial L}{\partial c}$, then repeat
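
The four steps above can be written out by hand. The following is a minimal NumPy sketch, not a line-for-line copy of what Keras runs internally: it assumes full-batch updates, a learning rate of 0.01, and a starting point of m = 0, c = 0, all chosen for illustration.

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

m, c = 0.0, 0.0   # assumed starting point
lr = 0.01         # assumed learning rate

for epoch in range(500):
    y_hat = m * x + c                      # 1. forward pass
    error = y - y_hat
    loss = np.mean(error ** 2)             # 2. MSE loss
    grad_m = -2.0 * np.mean(error * x)     # 3. dL/dm
    grad_c = -2.0 * np.mean(error)         #    dL/dc
    m -= lr * grad_m                       # 4. update both parameters
    c -= lr * grad_c

print(f"m={m:.3f}, c={c:.3f}, loss={loss:.6f}")
```

After 500 iterations the parameters sit close to, but not exactly at, m = 2 and c = -1, since gradient descent approaches the minimum gradually.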

Making Predictions

# Should predict ≈ 2*10 - 1 = 19
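# Wrap the input in a NumPy array; Keras 3 (TensorFlow 2.16+) does not accept a plain Python list here
pred = model.predict(np.array([10.0]), verbose=0)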
print(f"Prediction for x=10: {pred[0][0]:.4f}")
# Prints a value close to 19; the exact digits vary from run to run

The model learned $y = 2x - 1$ from just 6 data points!

Inspecting Learned Weights

weights, bias = model.layers[0].get_weights()
print(f"Learned slope (m): {weights[0][0]:.4f}")  # ≈ 2.0
print(f"Learned bias  (c): {bias[0]:.4f}")         # ≈ -1.0

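The model recovered the underlying pattern to within a small numerical error. The exact values vary slightly between runs because the weights start from random values.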

Visualizing the Fit

import matplotlib.pyplot as plt

x_range = np.linspace(-2, 6, 100)
y_pred_line = model.predict(x_range, verbose=0).flatten()

plt.scatter(X, y, color='blue', s=80, zorder=5, label='Training data')
plt.plot(x_range, y_pred_line, color='red', linewidth=2, label='Learned fit')
plt.plot(x_range, 2*x_range - 1, color='green', linestyle='--', label='True: y=2x-1')
plt.legend()
plt.title("Linear Regression with TensorFlow")
plt.xlabel("x")
plt.ylabel("y")
plt.show()

The learned fit (red) overlaps almost exactly with the true function (green).

Loss Curve

plt.figure(figsize=(8, 4))
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('MSE Loss')
plt.title('Training Loss — Convergence')
plt.yscale('log')
plt.show()

Loss drops sharply in the first ~50 epochs and then keeps shrinking toward zero more slowly, which is typical of gradient descent on a convex loss surface.
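
One way to check the convexity claim: for MSE with a linear model, the Hessian with respect to (m, c) is constant and equals (2/n) AᵀA, where A has the columns x and 1. A minimal NumPy sketch using the tutorial's six inputs (the names `A` and `hessian` are chosen here for illustration):

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
A = np.column_stack([x, np.ones_like(x)])   # design matrix with columns [x, 1]
hessian = (2.0 / len(x)) * A.T @ A          # second derivatives of MSE w.r.t. (m, c)
eigenvalues = np.linalg.eigvalsh(hessian)   # returned in ascending order
print(eigenvalues)
```

Both eigenvalues are positive, so the surface is a bowl with a single minimum. They differ by roughly a factor of ten, which is consistent with the loss falling quickly at first and more slowly afterwards.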

From Regression to Deep Learning

This 2-parameter model is the simplest possible neural network. The Keras API scales identically to much larger models:

| Model | Parameters | Description |
|---|---|---|
| This tutorial | 2 | Linear regression: 1 Dense(1) |
| MNIST classifier | 535K | Dense(512) → Dense(256) → Dense(10) |
| ResNet-50 | 25M | Deep CNN for ImageNet |
| GPT-2 | 117M–1.5B | Transformer language model |
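
The first two counts follow from a simple rule: a Dense layer with `n_in` inputs and `n_out` units has `n_in * n_out + n_out` parameters. The sketch below applies that rule; the helpers `dense_params` and `mlp_params` are hypothetical, and the 784-dimensional input for the MNIST row is an assumption (a flattened 28×28 image), since the table does not state the input size.

```python
def dense_params(n_in, n_out):
    """Parameter count of a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

def mlp_params(layer_sizes):
    """Total parameters for stacked Dense layers given [input, hidden..., output] sizes."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))

print(mlp_params([1, 1]))                  # 2
print(mlp_params([784, 512, 256, 10]))     # 535818
```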

The training loop (forward pass → loss → backward pass → weight update) is identical across all of them. TensorFlow and Keras handle the mechanics — you focus on architecture and data.

Next Steps

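  1. Multiple features: increase the input dimension (for example, input_shape=[3] for three features); the single Dense unit then learns one weight per feature
  2. Non-linearity: add hidden Dense layers with ReLU activation so the model can fit non-linear relationships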
  3. Regression on real data: try the California Housing dataset (the Boston Housing dataset was removed from scikit-learn in version 1.2)
  4. Feature scaling: normalize inputs with scikit-learn's StandardScaler for faster convergence
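
For the feature-scaling step, the transformation itself is short enough to write by hand. The NumPy sketch below standardizes each column to zero mean and unit variance, which is the same arithmetic scikit-learn's StandardScaler applies with its default settings; the feature matrix is made up for illustration.

```python
import numpy as np

# Hypothetical feature matrix: 4 samples, 2 features on very different scales
features = np.array([
    [1.0, 1000.0],
    [2.0, 2000.0],
    [3.0, 3000.0],
    [4.0, 4000.0],
])

mean = features.mean(axis=0)
std = features.std(axis=0)          # population standard deviation (ddof=0)
scaled = (features - mean) / std    # each column now has mean 0 and std 1

print(scaled.mean(axis=0), scaled.std(axis=0))
```

At prediction time, reuse the `mean` and `std` computed on the training data rather than recomputing them on new inputs.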