
Understanding the Bias–Variance Trade-off

Photo by Patrick Fore on Unsplash

One of the most fundamental concepts in Machine Learning is the bias–variance trade-off. It explains why some models are too simple and fail to capture patterns in the data, while others are too complex and fail to generalize to new, unseen data.

In this blog post, we will break down bias and variance in plain English and then demonstrate them using the Iris dataset with Decision Trees in Python. By the end, you’ll not only understand what these terms mean but also see them in action with simple experiments.

What is Bias?

Bias is the error introduced by making overly simplistic assumptions in a model. Think of bias as a form of “stubbornness” — no matter how much data you provide, the model insists on sticking to an overly simple explanation.

  • A high bias model oversimplifies the problem.
  • It fails to capture the true patterns in data.
  • As a result, both training accuracy and test accuracy are low.

Example in real life: Imagine trying to predict house prices using only the number of bedrooms. You are ignoring all other features (location, size, age, amenities). Your predictions will be consistently wrong because the model is too simplistic.
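To see high bias in code, here is a minimal sketch using the same Iris dataset and a depth-1 "stump" that is only allowed to ask one question (exact numbers will vary a little with the split):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# A depth-1 tree can make only one split, so it can perfectly
# isolate just one of the three species (setosa) and must lump
# the other two together.
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)
print("Train accuracy:", round(stump.score(X_train, y_train), 2))
print("Test accuracy: ", round(stump.score(X_test, y_test), 2))

Both scores land around two-thirds, and adding more training data would not change that: the model's one-question budget is the bottleneck, not the data.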

What is Variance?

Variance refers to how much a model’s predictions change if we train it on a different dataset (or even a different split of the same dataset).
Think of variance as over-sensitivity — the model is like a student who memorizes answers to practice questions but fails to generalize when the actual exam questions are slightly different.

  • A high-variance model fits the training data extremely well.
  • It performs poorly on test data because it fails to generalize.
  • Training accuracy is high, but test accuracy fluctuates a lot.

Example in real life: Imagine memorizing an entire textbook word for word instead of understanding the concepts. You’ll ace practice questions (training data) but struggle with slightly modified exam questions (test data).
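Here is a minimal sketch of high variance, again with Iris: the same fully grown tree trained on two different random splits can disagree on the very same flowers (how many depends on the splits you happen to draw):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Train an unrestricted (fully grown) tree on two different
# random splits and compare their predictions on all 150 flowers.
predictions = []
for seed in (0, 1):
    X_train, _, y_train, _ = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    predictions.append(tree.predict(X))

changed = np.sum(predictions[0] != predictions[1])
print(f"The two trees disagree on {changed} of {len(X)} flowers")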

The Bias–Variance Trade-off

  • High Bias, Low Variance → The model is consistently wrong in the same way. It’s simple, predictable, but inaccurate (underfitting).
  • Low Bias, High Variance → The model gets training data perfectly but struggles badly on new data. It’s complex, unstable, and unreliable (overfitting).
  • The sweet spot lies somewhere in between: a model complex enough to learn the important patterns, yet simple enough to generalize well. The short sketch below shows one way to look for it.
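One quick way to look for the sweet spot is to sweep max_depth and watch cross-validated accuracy: it typically rises, peaks at a moderate depth, and then flattens or dips as the tree starts fitting noise. A sketch:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Average held-out accuracy over 5 folds for each depth.
for depth in range(1, 8):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    scores = cross_val_score(model, X, y, cv=5)
    print(f"max_depth={depth}: mean CV accuracy = {scores.mean():.3f}")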

The Dartboard Analogy 🎯

Imagine throwing darts at a board, aiming for the bullseye:

  • High Bias, Low Variance: All darts are far from the center but clustered together.
  • Low Bias, High Variance: Darts are spread out widely, some close to the center, some far away.
  • High Bias, High Variance: Darts are far from the center and scattered.
  • Low Bias, Low Variance (Ideal): Darts are tightly clustered at the bullseye.
Image: the classic bias–variance dartboard diagram, from the Cornell CS4780 lecture notes (https://www.cs.cornell.edu/courses/cs4780/2018fa/lectures/lecturenote12.html)
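To make the analogy concrete, here is a tiny simulation: each "model" throws 100 darts, where the average offset from the bullseye plays the role of bias and the scatter plays the role of variance.

import numpy as np

rng = np.random.default_rng(0)
# (offset from bullseye, scatter) for each of the four cases above
cases = {
    "High bias, low variance":  (2.0, 0.2),
    "Low bias, high variance":  (0.0, 2.0),
    "High bias, high variance": (2.0, 2.0),
    "Low bias, low variance":   (0.0, 0.2),
}
for name, (bias, spread) in cases.items():
    darts = rng.normal(loc=bias, scale=spread, size=(100, 2))
    distance = np.linalg.norm(darts, axis=1).mean()
    print(f"{name}: average distance from bullseye = {distance:.2f}")

Only the last case comes close to zero: low error requires both low bias and low variance.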

Why Decision Trees for Demonstration?

Decision Trees are perfect for demonstrating bias and variance because their complexity can be controlled directly through the max_depth parameter.

  • A very shallow tree (depth = 1) makes extremely simple decisions, leading to high bias.
  • A very deep tree (depth = 10 or more) memorizes the training data, leading to high variance.

This makes Decision Trees an ideal tool to show the two extremes of the trade-off.

We’ll use the famous Iris dataset (classification of flowers into 3 species) and run experiments where:

  1. We train a decision tree multiple times with different random train-test splits.
  2. We compare the training and test accuracy across runs.
  3. We visualize how stable (or unstable) the results are.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Load data
X, y = load_iris(return_X_y=True)

def experiment(tree_depth, n_experiments=20, test_size=0.3):
    train_scores, test_scores = [], []
    for _ in range(n_experiments):
        # A fresh random split each run; this randomness is the
        # source of the run-to-run variation we want to observe.
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=test_size, stratify=y
        )
        model = DecisionTreeClassifier(max_depth=tree_depth, random_state=None)
        model.fit(X_train, y_train)
        train_scores.append(model.score(X_train, y_train))
        test_scores.append(model.score(X_test, y_test))
    return train_scores, test_scores

# High Bias, Low Variance (depth=1)
train_low, test_low = experiment(tree_depth=1)

# Low Bias, High Variance (depth=10)
train_high, test_high = experiment(tree_depth=10)

# Plotting
plt.figure(figsize=(12,5))

# Depth=1
plt.subplot(1,2,1)
plt.plot(train_low, label="Train Accuracy", marker="o")
plt.plot(test_low, label="Test Accuracy", marker="s")
plt.title("High Bias, Low Variance (Tree depth=1)")
plt.xlabel("Experiment")
plt.ylabel("Accuracy")
plt.ylim(0.4,1.05)
plt.legend()

# Depth=10
plt.subplot(1,2,2)
plt.plot(train_high, label="Train Accuracy", marker="o")
plt.plot(test_high, label="Test Accuracy", marker="s")
plt.title("Low Bias, High Variance (Tree depth=10)")
plt.xlabel("Experiment")
plt.ylabel("Accuracy")
plt.ylim(0.4,1.05)
plt.legend()

plt.tight_layout()
plt.show()

Results & Discussion

1. Tree Depth = 1 (High Bias, Low Variance)

  • Training accuracy hovers around 60–70%.
  • Test accuracy is also low but relatively stable across experiments.
  • The model is consistently wrong in the same way → underfitting.

2. Tree Depth = 10 (Low Bias, High Variance)

  • Training accuracy is almost perfect (close to 100%).
  • Test accuracy fluctuates noticeably from experiment to experiment.
  • The model memorized the training data but fails to generalize → overfitting.

Key Takeaways

  • High Bias → The model is too simple → underfits → consistently wrong.
  • High Variance → The model is too complex → overfits → unstable predictions.
  • The best models balance the two, capturing important patterns without memorizing noise.

In real-world practice, this balance is often achieved using techniques like cross-validation, regularization, or ensemble methods (e.g., Random Forests).
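For a quick taste of the ensemble route, here is a sketch comparing a single deep tree with a Random Forest under 5-fold cross-validation on Iris; the forest typically keeps the deep tree's low bias while averaging away much of its variance:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

models = {
    "Single deep tree": DecisionTreeClassifier(max_depth=10, random_state=0),
    "Random Forest":    RandomForestClassifier(n_estimators=100, random_state=0),
}
# Averaging many decorrelated trees reduces variance without
# raising bias much, which usually shows up as a more stable score.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} (std {scores.std():.3f})")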

The bias–variance trade-off is not just a theoretical concept; it affects every model you train. By running small experiments like this, you can see underfitting and overfitting in action, which makes it much easier to understand why some models fail in production.

Decision Trees make a great teaching tool here, but the same principles apply to neural networks, regression models, and even modern deep learning systems.

Finding the right balance is the art and science of Machine Learning. 🌿
