Mean Absolute Error (MAE)

Start With The Problem

You trained a house price prediction model. Now you want to know — how good is my model?

Your model made these predictions:

House   Actual Price   Predicted Price
1       ₹50 L          ₹45 L
2       ₹80 L          ₹85 L
3       ₹60 L          ₹58 L
4       ₹90 L          ₹95 L
5       ₹70 L          ₹65 L

You need one single number that tells you — on average, by how much is my model wrong?

That number is MAE.


What is MAE?

MAE = Average of absolute differences between actual and predicted values

Three steps only:

  1. Find the error (Actual − Predicted) for each row
  2. Make all errors positive (take absolute value)
  3. Take the average

The Formula

MAE = (1/n) × Σ |Actual - Predicted|
  • n = total number of predictions
  • | | = absolute value (just remove the minus sign)
  • Σ = sum of everything

Manual Walkthrough — Step by Step

House   Actual   Predicted   Error (A-P)   |Error|
1       50       45          +5            5
2       80       85          -5            5
3       60       58          +2            2
4       90       95          -5            5
5       70       65          +5            5

Step 1 — Sum of absolute errors:

5 + 5 + 2 + 5 + 5 = 22

Step 2 — Divide by n (5 houses):

MAE = 22 / 5 = 4.4

Result: MAE = 4.4 Lakhs

This means — on average, your model is wrong by ₹4.4 Lakhs per house. Simple and clear.


Why Absolute Value? Why Not Just Average the Errors?

Without absolute value:

Errors = +5, -5, +2, -5, +5
Sum    = +5 - 5 + 2 - 5 + 5 = 2
Avg    = 2 / 5 = 0.4

This says the model is almost perfect, but it clearly is not! Positive and negative errors cancel each other out and give a false picture.

Absolute value fixes this — every error counts as positive, no cancellation.
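
The cancellation is easy to check in code. The sketch below reuses the five houses from the walkthrough and computes both averages; the variable names are illustrative only.

```python
actual = [50, 80, 60, 90, 70]
predicted = [45, 85, 58, 95, 65]

errors = [a - p for a, p in zip(actual, predicted)]

# Signed average: positive and negative errors cancel each other out
mean_error = sum(errors) / len(errors)

# Absolute average: every error counts toward the total
mae = sum(abs(e) for e in errors) / len(errors)

print(f"Mean error (signed) : {mean_error}")
print(f"MAE (absolute)      : {mae}")
```

The signed average is still worth a glance as a bias check: a positive value here means the model under-predicts on average, because the error is defined as Actual − Predicted.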


Python Program


    import numpy as np
    import pandas as pd
    from sklearn.metrics import mean_absolute_error
    import matplotlib.pyplot as plt

    # --- Data ---
    actual    = [50, 80, 60, 90, 70]
    predicted = [45, 85, 58, 95, 65]

    # --- Manual Calculation ---
    errors          = [a - p for a, p in zip(actual, predicted)]
    absolute_errors = [abs(e) for e in errors]
    mae_manual      = sum(absolute_errors) / len(absolute_errors)

    print("=== Manual Calculation ===")
    print(f"Errors          : {errors}")
    print(f"Absolute Errors : {absolute_errors}")
    print(f"MAE (manual)    : {mae_manual}")

    # --- Using NumPy ---
    mae_numpy = np.mean(np.abs(np.array(actual) - np.array(predicted)))
    print(f"\nMAE (numpy)     : {mae_numpy}")

    # --- Using Scikit-learn ---
    mae_sklearn = mean_absolute_error(actual, predicted)
    print(f"MAE (sklearn)   : {mae_sklearn}")

    # --- DataFrame view ---
    df = pd.DataFrame({
        'Actual'         : actual,
        'Predicted'      : predicted,
        'Error'          : errors,
        'Absolute Error' : absolute_errors
    })
    print(f"\n{df.to_string(index=False)}")

    # --- Plot ---
    x = range(1, len(actual) + 1)
    plt.figure(figsize=(10, 5))
    plt.plot(x, actual,    label='Actual',    marker='o', linewidth=2)
    plt.plot(x, predicted, label='Predicted', marker='s', linewidth=2, linestyle='--')

    for i in x:
        plt.vlines(i, min(actual[i-1], predicted[i-1]),
                    max(actual[i-1], predicted[i-1]),
                    colors='red', linewidth=2, alpha=0.6)

    plt.title(f'Actual vs Predicted (MAE = {mae_sklearn})')
    plt.xlabel('House')
    plt.ylabel('Price (Lakhs)')
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.savefig('mae_plot.png')
    plt.show()
    print("\nPlot saved!")

Output:

    === Manual Calculation ===
    Errors          : [5, -5, 2, -5, 5]
    Absolute Errors : [5, 5, 2, 5, 5]
    MAE (manual)    : 4.4

    MAE (numpy)     : 4.4
    MAE (sklearn)   : 4.4

     Actual  Predicted  Error  Absolute Error
         50         45      5               5
         80         85     -5               5
         60         58      2               2
         90         95     -5               5
         70         65      5               5

    Plot saved!

The red vertical lines in the plot show the error for each prediction — MAE is just the average length of those red lines.


How to Read MAE

MAE is in the same unit as your target variable.

Target                MAE = 4.4 means
House Price (Lakhs)   Wrong by ₹4.4L on average
Temperature (°C)      Wrong by 4.4°C on average
Sales (units)         Wrong by 4.4 units on average

This is MAE's biggest strength — it's directly interpretable. No unit conversion needed.
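
Because MAE carries the unit of the target, rescaling the target rescales MAE by the same factor. A minimal sketch, assuming the house prices above and a hypothetical helper named `mae`:

```python
import numpy as np

actual_lakhs = np.array([50, 80, 60, 90, 70], dtype=float)
predicted_lakhs = np.array([45, 85, 58, 95, 65], dtype=float)

def mae(y_true, y_pred):
    """Mean absolute error of two equal-length arrays (illustrative helper)."""
    return float(np.mean(np.abs(y_true - y_pred)))

RUPEES_PER_LAKH = 100_000  # 1 lakh = 100,000 rupees

mae_lakhs = mae(actual_lakhs, predicted_lakhs)
mae_rupees = mae(actual_lakhs * RUPEES_PER_LAKH, predicted_lakhs * RUPEES_PER_LAKH)

print(f"MAE in lakhs  : {mae_lakhs:.1f}")
print(f"MAE in rupees : {mae_rupees:,.0f}")
```

The same model gets 4.4 or 440,000 depending on the unit, so always report the unit alongside the number.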


MAE vs Other Metrics — When to Use What

Metric   Penalizes Big Errors?   Interpretable?   Use When
MAE      No (equal weight)       Yes, same unit   Outliers exist, you want simple average error
MSE      Yes (squares errors)    Squared unit     Big errors are very bad, want to penalize them hard
RMSE     Yes                     Yes, same unit   Big errors are bad but want interpretable result
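
To see the three metrics side by side, the sketch below computes them with NumPy on the same five houses. It is only an illustration of the definitions, not a recommendation of one metric over another.

```python
import numpy as np

actual = np.array([50, 80, 60, 90, 70], dtype=float)
predicted = np.array([45, 85, 58, 95, 65], dtype=float)
errors = actual - predicted

mae = float(np.mean(np.abs(errors)))   # average absolute error, in lakhs
mse = float(np.mean(errors ** 2))      # average squared error, in lakhs squared
rmse = float(np.sqrt(mse))             # square root of MSE, back in lakhs

print(f"MAE  = {mae:.2f} lakhs")
print(f"MSE  = {mse:.2f} lakhs squared")
print(f"RMSE = {rmse:.2f} lakhs")
```

On this data RMSE comes out slightly above MAE, because squaring gives the four 5-lakh errors more pull than the single 2-lakh error.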


The One Weakness of MAE

MAE weights every unit of error the same. A ₹2L error and a ₹20L error are simply added as-is, so a large miss gets no extra penalty.

Error of 2  → contributes 2
Error of 20 → contributes 20

If you want your model to heavily penalize large mistakes, use MSE or RMSE instead. But if outliers exist in your data and you don't want them to dominate the metric — MAE is safer.
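
The sketch below illustrates that trade-off. It takes the five house errors and appends one hypothetical sixth house with a 40 lakh miss (invented here purely for illustration), then compares how much MAE and RMSE grow.

```python
import numpy as np

def mae(errors):
    """Mean of absolute errors."""
    return float(np.mean(np.abs(errors)))

def rmse(errors):
    """Square root of the mean squared error."""
    return float(np.sqrt(np.mean(np.square(errors))))

house_errors = np.array([5, -5, 2, -5, 5], dtype=float)
# Hypothetical sixth house with a 40 lakh miss, added only for illustration
with_outlier = np.append(house_errors, 40.0)

for name, errs in [("5 houses", house_errors), ("plus 40L miss", with_outlier)]:
    print(f"{name:<14} MAE = {mae(errs):6.2f}   RMSE = {rmse(errs):6.2f}")

mae_growth = mae(with_outlier) / mae(house_errors)
rmse_growth = rmse(with_outlier) / rmse(house_errors)
print(f"MAE grew {mae_growth:.2f}x, RMSE grew {rmse_growth:.2f}x")
```

RMSE grows noticeably more than MAE from the single large miss, which is what you want when big errors are costly and what you want to avoid when the outlier is just noise.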


Real ML Project Usage


    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split

    # Placeholder synthetic data so the snippet runs on its own; swap in your own X and y
    X, y = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=42)

    # Hold out 20% of the rows for evaluation
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Train the model
    model = LinearRegression()
    model.fit(X_train, y_train)

    # Predict on the unseen test rows
    y_pred = model.predict(X_test)

    # Evaluate
    mae = mean_absolute_error(y_test, y_pred)
    print(f"Model MAE: {mae:.2f}")

    # Rule of thumb: MAE should be small relative to your target's range
    target_range = y_test.max() - y_test.min()
    print(f"Target range : {target_range:.2f}")
    print(f"MAE as % of range: {mae / target_range * 100:.1f}%")

MAE as % of range is a handy sanity check. As a rough guide, an MAE around 5% of the range suggests a decent model, while 40% suggests the model needs work. The right cutoff depends on your problem.


One Line Summary

MAE tells you, on average, how far your model's predictions are from the real values, in the same unit as your data, which makes it one of the most human-readable regression metrics.

Exponential Moving Average (EMA)

The Problem with SMA First

Remember SMA (Simple Moving Average): it gives equal weight to all values in the window.

For a 3-day SMA on sales:

Day 1: 200, Day 2: 450, Day 3: 180

SMA = (200 + 450 + 180) / 3 = 276.67

Here, Day 1 (old data) and Day 3 (today) are treated equally. But think about it — should 2-day-old data matter as much as today's data?

In most real cases — No. Recent data is more important.

That's exactly what EMA fixes.


What is EMA?

EMA gives MORE weight to recent values and LESS weight to older values.

The further back a value is, the less it influences the average. Recent values dominate.


Weight Concept — Simple Visual

For a 3-period EMA, weights look like this:

Data Point            Weight
Today (most recent)   Highest ⬆️
Yesterday             Medium
Day before            Low
Even older            Very Low (almost ignored)

Compare this to SMA where every day gets exactly equal weight.
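
To put numbers on "highest, medium, low", the sketch below prints the weight each past value receives, assuming the smoothing factor α = 2 / (N + 1) = 0.5 for N = 3 that the formula section defines. In the recursive EMA, a value that is k steps old carries weight α × (1 − α)^k (the very first value of the series is the one exception, since it seeds the EMA).

```python
alpha = 0.5  # smoothing factor for N = 3, i.e. 2 / (3 + 1)

labels = ["Today", "Yesterday", "Day before", "3 days ago", "4 days ago"]

# A value that is k steps old carries weight alpha * (1 - alpha) ** k
weights = [alpha * (1 - alpha) ** k for k in range(len(labels))]

for label, weight in zip(labels, weights):
    print(f"{label:<11} {weight:.4f}")
```

Each step back halves the weight here, which is where the "exponential" in the name comes from.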


The Formula

EMA today = (Today's Value × α) + (Yesterday's EMA × (1 - α))

Where α (alpha) is the smoothing factor:

α = 2 / (N + 1)

For N = 3:

α = 2 / (3 + 1) = 0.5

That means — 50% weight to today, 50% to the past EMA.

For N = 10:

α = 2 / (10 + 1) = 0.18

A smaller alpha gives a smoother line, because each new value moves the EMA less and older data keeps more influence.
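
A quick way to get a feel for α is to tabulate it for a few window sizes. The helper name `smoothing_factor` below is made up for this sketch.

```python
def smoothing_factor(n: int) -> float:
    """Return alpha = 2 / (N + 1) for an N-period EMA."""
    return 2 / (n + 1)

for n in (3, 7, 10, 21):
    alpha = smoothing_factor(n)
    print(f"N = {n:>2}   alpha = {alpha:.3f}   weight on previous EMA = {1 - alpha:.3f}")
```

As N grows, alpha shrinks and the previous EMA keeps a larger share of the result.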


Manual Walkthrough — Step by Step

Daily Sales data:

Day   Sales
1     200
2     450
3     180
4     500
5     220

Using N = 3, so α = 0.5

Step 1 — Day 1: No previous EMA exists, so EMA = first value itself

EMA(1) = 200

Step 2 — Day 2:

EMA(2) = (450 × 0.5) + (200 × 0.5)
       = 225 + 100
       = 325

Step 3 — Day 3:

EMA(3) = (180 × 0.5) + (325 × 0.5)
       = 90 + 162.5
       = 252.5

Step 4 — Day 4:

EMA(4) = (500 × 0.5) + (252.5 × 0.5)
       = 250 + 126.25
       = 376.25

Step 5 — Day 5:

EMA(5) = (220 × 0.5) + (376.25 × 0.5)
       = 110 + 188.125
       = 298.125

Final result:

Day   Sales   EMA (N=3)
1     200     200
2     450     325
3     180     252.5
4     500     376.25
5     220     298.125

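Notice that EMA moves halfway toward each new value: on Day 4 (500) it jumps from 252.5 to 376.25. On this zigzag data the 3-day SMA for Day 4 lands at almost the same value (376.67), so EMA's faster reaction shows up more clearly when the data shifts to a new level and stays there.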
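
The manual steps translate directly into a short loop. This is a minimal sketch of the recursive formula, assuming a non-empty list of values and seeding the EMA with the first value as in Step 1; the function name `ema` is illustrative.

```python
def ema(values, n):
    """Recursive EMA: seed with the first value, then blend each new value in."""
    alpha = 2 / (n + 1)
    result = [float(values[0])]  # Day 1: no previous EMA, so use the value itself
    for value in values[1:]:
        result.append(alpha * value + (1 - alpha) * result[-1])
    return result

sales = [200, 450, 180, 500, 220]
ema_3 = ema(sales, n=3)
print(ema_3)  # [200.0, 325.0, 252.5, 376.25, 298.125]
```

These are the same numbers as the walkthrough, and they match what pandas produces with ewm(span=3, adjust=False).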


SMA vs EMA — Side by Side

Feature                   SMA                      EMA
Weight to all values      Equal                    More to recent
Reacts to sudden change   Slow                     Fast
Smoother line             Yes                      Slightly less smooth
NaN at start              Yes (first N - 1 rows)   No
Best for                  Long-term trend          Short-term, fast signals
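
The "reacts to sudden change" and "NaN at start" rows can be checked on a tiny series. The data below is hypothetical: a level that sits at 100 and then shifts to 200.

```python
import pandas as pd

# Hypothetical series: five days at 100, then a sustained shift to 200
level_shift = pd.Series([100] * 5 + [200] * 5, dtype=float)

sma_3 = level_shift.rolling(window=3).mean()
ema_3 = level_shift.ewm(span=3, adjust=False).mean()

comparison = pd.DataFrame({"value": level_shift, "SMA_3": sma_3, "EMA_3": ema_3})
print(comparison.to_string())
```

On the first day of the shift (index 5) the EMA is already at 150 while the SMA is only at about 133. The SMA then catches up fully once the whole window sits at the new level, whereas the EMA keeps approaching 200 without ever snapping to it. The first two SMA rows are NaN; the EMA has a value from the first row.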


Python Program


    import pandas as pd
    import matplotlib.pyplot as plt

    # --- Data ---
    data = {
        'day': list(range(1, 16)),
        'sales': [200, 450, 180, 500, 220, 480, 210, 460, 190, 510, 230, 490, 200, 470, 215]
    }

    df = pd.DataFrame(data)

    # --- Calculate SMA and EMA ---
    df['SMA_3'] = df['sales'].rolling(window=3).mean()
    df['EMA_3'] = df['sales'].ewm(span=3, adjust=False).mean()  # EMA with N=3
    df['EMA_7'] = df['sales'].ewm(span=7, adjust=False).mean()  # EMA with N=7

    print(df.to_string(index=False))

    # --- Plot ---
    plt.figure(figsize=(12, 5))
    plt.plot(df['day'], df['sales'], label='Raw Sales', marker='o', linewidth=1.5, alpha=0.6)
    plt.plot(df['day'], df['SMA_3'], label='SMA (3-day)', linewidth=2, linestyle='--')
    plt.plot(df['day'], df['EMA_3'], label='EMA (3-day)', linewidth=2)
    plt.plot(df['day'], df['EMA_7'], label='EMA (7-day)', linewidth=2)

    plt.title('SMA vs EMA Comparison')
    plt.xlabel('Day')
    plt.ylabel('Sales')
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.savefig('ema_vs_sma.png')
    plt.show()
    print("Plot saved!")


Output:

     day  sales      SMA_3      EMA_3      EMA_7
       1    200        NaN 200.000000 200.000000
       2    450        NaN 325.000000 262.500000
       3    180 276.666667 252.500000 241.875000
       4    500 376.666667 376.250000 306.406250
       5    220 300.000000 298.125000 284.804688
       6    480 400.000000 389.062500 333.603516
       7    210 303.333333 299.531250 302.702637
       8    460 383.333333 379.765625 342.026978
       9    190 286.666667 284.882812 304.020233
      10    510 386.666667 397.441406 355.515175
      11    230 310.000000 313.720703 324.136381
      12    490 410.000000 401.860352 365.602286
      13    200 306.666667 300.930176 324.201714
      14    470 386.666667 385.465088 360.651286
      15    215 295.000000 300.232544 324.238464
    Plot saved!



Key Things to Remember

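ewm(span=3): span is your N value and plays the same role as window in rolling.

adjust=False: uses the recursive formula shown above (the standard EMA). Use it whenever you want results that match the manual calculation; the pandas default is adjust=True, which weights the early rows differently.

No NaN: EMA starts from Day 1 itself, unlike SMA, which waits until it has N values.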
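
To see what the adjust flag changes, the sketch below runs both settings on the five-day sales data from the walkthrough. With adjust=True, pandas computes a weighted average whose weights are renormalized at every row, so the early rows differ from the recursive formula.

```python
import pandas as pd

sales = pd.Series([200, 450, 180, 500, 220], dtype=float)

recursive = sales.ewm(span=3, adjust=False).mean()  # matches the manual walkthrough
adjusted = sales.ewm(span=3, adjust=True).mean()    # pandas default weighting

comparison = pd.DataFrame({
    "sales": sales,
    "adjust=False": recursive,
    "adjust=True": adjusted,
})
print(comparison.to_string(index=False))
```

Both columns start at 200, but on Day 2 the recursive version gives 325 while the adjusted version gives (450 + 0.5 × 200) / 1.5 ≈ 366.67. The gap narrows as more rows accumulate.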


Where EMA is Used in ML


    # Feature Engineering with EMA
    df['ema_3']  = df['sales'].ewm(span=3,  adjust=False).mean()  # short trend
    df['ema_7']  = df['sales'].ewm(span=7,  adjust=False).mean()  # medium trend
    df['ema_21'] = df['sales'].ewm(span=21, adjust=False).mean()  # long trend

    # EMA reacts faster — great for detecting sudden changes (fraud, anomaly)
    df['deviation_from_ema'] = df['sales'] - df['ema_7']  # how far today is from trend

deviation_from_ema is a very powerful feature — if this value is very high or very low, it signals something unusual happening. Used heavily in anomaly detection and fraud detection models.
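
As a rough illustration of how such a feature might feed an anomaly flag, the sketch below uses hypothetical daily sales with one spike and a hypothetical fixed cutoff of 100 units. Both the data and the threshold are assumptions made for this example; a real project would tune the cutoff to its own data.

```python
import pandas as pd

# Hypothetical daily sales (days 1 to 10) with one unusual spike on day 8
sales = pd.Series(
    [200, 210, 205, 195, 215, 208, 202, 600, 207, 199],
    index=range(1, 11),
    dtype=float,
    name="sales",
)

ema_7 = sales.ewm(span=7, adjust=False).mean()
deviation_from_ema = sales - ema_7  # how far each day is from its trend

THRESHOLD = 100  # hypothetical cutoff, chosen only for this example
is_anomaly = deviation_from_ema.abs() > THRESHOLD

flagged_days = sales.index[is_anomaly].tolist()
print(f"Flagged days: {flagged_days}")
```

Only day 8 crosses the cutoff. Note that the spike also pulls the EMA upward, so the days right after it show a negative deviation even though their sales are normal; a cutoff set too tight would flag those as well.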


One Line Summary

EMA is a smarter Moving Average — it remembers the past but pays more attention to what just happened, making it faster to react to real changes in data.
