MPE - Mean Percentage Error

The Mean Percentage Error (MPE) is a statistical metric used to measure the systematic bias of a forecasting model in percentage terms.

Unlike the Mean Absolute Percentage Error (MAPE), MPE does not use absolute values. By preserving the sign of the errors, it allows positive and negative errors to cancel each other out, revealing whether the model has a consistent tendency to over-forecast or under-forecast.

\[\text{MPE}(y, \hat{y}) = \frac{100\%}{N} \sum_{i=1}^{N} \frac{y_i - \hat{y}_i}{y_i}\]

Note: In permetrics, the numerator is calculated as Actual minus Predicted \((y_i - \hat{y}_i)\). Therefore, a positive MPE indicates that the model systematically under-predicts, while a negative MPE indicates over-prediction.
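The sign convention can be checked with a few lines of plain Python. This is a minimal sketch of the formula above, not the permetrics implementation; the helper name `mpe` is hypothetical, and it returns the value in percent units, so check the output scale of the library before comparing numbers.

```python
def mpe(y_true, y_pred):
    """Mean Percentage Error in percent, using (actual - predicted) / actual."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    ratios = [(a - p) / a for a, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)

actual = [100.0, 200.0, 400.0]
under = mpe(actual, [90.0, 180.0, 360.0])    # every forecast is 10% too low
over = mpe(actual, [110.0, 220.0, 440.0])    # every forecast is 10% too high

print("under-forecasting MPE:", round(under, 6))  # positive
print("over-forecasting MPE:", round(over, 6))    # negative
```

Forecasts that sit below the actual values give a positive result, and forecasts that sit above them give a negative one.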


Description

Key Insight: The Percentage Bias Indicator

MPE is to MAPE what Mean Bias Error (MBE) is to Mean Absolute Error (MAE). Because signed errors cancel, an MPE of 0.0 does not mean the model’s predictions are accurate; it only means that the positive and negative percentage errors sum to zero. MPE should not be used alone to evaluate accuracy; it is a bias diagnostic meant to be paired with a magnitude metric such as MAPE or RMSE.
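To make the difference between bias and accuracy concrete, the sketch below computes MPE and MAPE on the same made-up data in plain Python. The numbers are illustrative only and the helper `percentage_errors` is hypothetical.

```python
def percentage_errors(y_true, y_pred):
    """Signed percentage errors, (actual - predicted) / actual, in percent."""
    return [100.0 * (a - p) / a for a, p in zip(y_true, y_pred)]

y_true = [100.0, 100.0, 100.0, 100.0]
y_pred = [150.0, 50.0, 140.0, 60.0]   # every forecast is off by 40 to 50 percent

errors = percentage_errors(y_true, y_pred)
mpe_value = sum(errors) / len(errors)                    # signed errors cancel
mape_value = sum(abs(e) for e in errors) / len(errors)   # magnitudes do not

print("MPE: ", mpe_value)    # 0.0
print("MAPE:", mape_value)   # 45.0
```

The forecasts are poor, yet MPE reports no bias because the over- and under-predictions offset each other; MAPE exposes the size of the errors.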

Advantages:
  • Directional Percentage Diagnostic: It expresses systematic bias in terms that stakeholders can read directly. For example, “our inventory model has a -15% MPE” says that forecasts run, on average, about 15% above actual demand, so the model is systematically over-stocking.

  • Scale-Independence: Because the error is normalized by the actual value, it can evaluate bias across entirely different datasets, product lines, or scales.
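As a small illustration of scale-independence, the sketch below compares the raw mean error with MPE for two made-up product lines whose volumes differ by a factor of 1000. Both helper functions are hypothetical and written in plain Python, using the same Actual minus Predicted convention as the formula above.

```python
def mean_error(y_true, y_pred):
    """Raw signed bias (actual - predicted), in the units of the data."""
    return sum(a - p for a, p in zip(y_true, y_pred)) / len(y_true)

def mpe(y_true, y_pred):
    """Signed bias relative to the actual value, in percent."""
    return 100.0 * sum((a - p) / a for a, p in zip(y_true, y_pred)) / len(y_true)

# Both lines are over-forecast by 20%, at very different volumes.
small_true, small_pred = [10.0, 20.0], [12.0, 24.0]
large_true, large_pred = [10000.0, 20000.0], [12000.0, 24000.0]

print("raw bias:", mean_error(small_true, small_pred), mean_error(large_true, large_pred))
print("MPE:     ", round(mpe(small_true, small_pred), 6), round(mpe(large_true, large_pred), 6))
```

The raw bias differs by three orders of magnitude between the two lines, while the MPE is the same for both, which is what makes it comparable across scales.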

Disadvantages:
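  • The Zero-Value Trap (Critical Flaw): Like other metrics that divide by the actual value, MPE breaks down when any ground truth value (\(y_i\)) is exactly 0.0: the calculation divides by zero and, depending on the implementation, raises an error or returns NaN/Inf. Actual values close to zero are also a problem, because they produce very large percentage errors that dominate the mean.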

  • Cancellation Illusion: Highly inaccurate models can still achieve an MPE close to zero if their wild over-predictions and under-predictions happen to average out.
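The zero-value problem is easy to reproduce. The sketch below shows the failure in plain Python and one possible workaround: dropping samples whose actual value is zero and reporting how many were dropped. The helper `mpe_skip_zeros` is hypothetical and is not something permetrics is known to do; whether discarding such samples is acceptable depends on the application.

```python
def mpe_skip_zeros(y_true, y_pred):
    """MPE in percent over samples with a non-zero actual value.

    Returns (mpe_value, number_of_skipped_samples).
    """
    pairs = [(a, p) for a, p in zip(y_true, y_pred) if a != 0]
    skipped = len(y_true) - len(pairs)
    if not pairs:
        raise ValueError("no non-zero actual values; MPE is undefined")
    value = 100.0 * sum((a - p) / a for a, p in pairs) / len(pairs)
    return value, skipped

y_true = [0.0, 50.0, 200.0]
y_pred = [5.0, 40.0, 220.0]

# The naive formula fails on the first sample.
try:
    naive = sum((a - p) / a for a, p in zip(y_true, y_pred)) / len(y_true)
    failed = False
except ZeroDivisionError:
    failed = True

value, skipped = mpe_skip_zeros(y_true, y_pred)
print("naive computation failed:", failed)
print("MPE over non-zero actuals:", round(value, 6), "| skipped samples:", skipped)
```

Reporting the skipped count alongside the score keeps the omission visible to whoever reads the result.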


Properties

  • Best possible score: 0.0 (Indicates zero systematic percentage bias).

  • Range: (-inf, +inf)
    • MPE > 0: The model systematically underestimates (Actual > Predicted).

    • MPE < 0: The model systematically overestimates (Actual < Predicted).

  • Mathematical Reference: Dataquest Regression Metrics


Example Usage

Note: Ensure your ground truth data does not contain zero values to avoid division-by-zero errors. Data should ideally be strictly positive.

from numpy import array
from permetrics.regression import RegressionMetric

## 1. For 1-D array (Single-output)
y_true = array([3, 0.5, 2, 7])
y_pred = array([2.5, 0.6, 2, 8])

evaluator = RegressionMetric(y_true, y_pred)
# Calculate Mean Percentage Error
print("MPE: ", evaluator.MPE())

## 2. For > 1-D array (Multi-output)
y_true = array([[0.5, 1], [0.1, 1], [7, 6]])
y_pred = array([[0.6, 2], [0.1, 2], [8, 5]])

evaluator = RegressionMetric(y_true, y_pred)
# Return an array of scores for each column
print("MPE (Multi-output): ", evaluator.MPE(multi_output="raw_values"))
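For intuition about the multi-output case, the per-column calculation can be written out by hand. The following plain-Python sketch applies the formula above to each output column of the same data; `mpe_per_column` is a hypothetical helper, and the assumption that it mirrors what `multi_output="raw_values"` returns (including the percent scaling) is not confirmed by the text above.

```python
def mpe_per_column(y_true, y_pred):
    """Column-wise MPE in percent for row-major nested lists."""
    n_rows = len(y_true)
    n_cols = len(y_true[0])
    scores = []
    for j in range(n_cols):
        # Sum the signed relative errors of column j over all rows.
        total = sum((y_true[i][j] - y_pred[i][j]) / y_true[i][j] for i in range(n_rows))
        scores.append(100.0 * total / n_rows)
    return scores

y_true = [[0.5, 1], [0.1, 1], [7, 6]]
y_pred = [[0.6, 2], [0.1, 2], [8, 5]]

scores = mpe_per_column(y_true, y_pred)
print("per-column MPE (%):", [round(s, 2) for s in scores])
```

Both columns come out negative (roughly -11.4 and -61.1 percent), which under the Actual minus Predicted convention indicates over-prediction in each output.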