EC - Efficiency Coefficient

The Efficiency Coefficient (EC) [22], mathematically identical to the Nash-Sutcliffe Efficiency and to the Coefficient of Determination \(R^2\), is a statistical metric used to evaluate the predictive accuracy of continuous regression models.

It assesses the model’s predictive skill relative to a baseline “no-knowledge” benchmark model (which simply predicts the mean of the observed data).

\[\text{EC}(y, \hat{y}) = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}\]

Note: \(y_i\) are the observed values, \(\hat{y}_i\) the predicted values, \(N\) the number of samples, and \(\bar{y}\) the mean of the observed values.
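The formula can be checked by hand with a few lines of NumPy. The sketch below is only an illustration of the equation above: `efficiency_coefficient` is a hypothetical helper name, not part of the permetrics API, and it assumes the observed values are not all identical (otherwise the denominator is zero).

```python
import numpy as np

def efficiency_coefficient(y_true, y_pred):
    """EC = 1 - SS_res / SS_tot (hypothetical helper, not the permetrics API)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # numerator: squared residuals
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # denominator: squared deviations from the mean
    return 1.0 - ss_res / ss_tot

ec = efficiency_coefficient([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])
print(round(ec, 4))  # 0.9486
```

Here the squared residuals sum to 1.5 and the squared deviations from the mean (2.875) sum to 29.1875, so EC = 1 - 1.5 / 29.1875.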


Description

Advantages:

  • Built-in Baseline Comparison: The score is read directly against the mean-prediction benchmark, which makes it highly interpretable.
    • EC = 1.0: Perfect prediction.

    • EC = 0.0: The model is only as accurate as predicting the constant mean of the observed data.

    • EC < 0.0: The model is worse than simply guessing the mean.

  • Scale-independent: Because the error variance is normalized by the data’s inherent variance, EC can be used to compare model performance across entirely different datasets and domains.
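The three reference points and the scale-independence property listed above can be reproduced with a small NumPy sketch. The helper `ec_score` and the toy data are assumptions made for illustration; they are not taken from permetrics.

```python
import numpy as np

def ec_score(y_true, y_pred):
    # Hypothetical helper: 1 - (sum of squared residuals) / (sum of squared deviations from the mean)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

y = np.array([2.0, 4.0, 6.0, 8.0])

perfect = ec_score(y, y)                              # predictions equal observations
mean_model = ec_score(y, np.full_like(y, y.mean()))   # always predict the observed mean
worse = ec_score(y, y[::-1])                          # predictions in reversed order

print(perfect, mean_model, worse)  # 1.0 0.0 -3.0

# Scale independence: rescaling observations and predictions together leaves EC unchanged.
y_hat = y + np.array([0.5, -0.5, 0.5, -0.5])
ec_small = ec_score(y, y_hat)
ec_large = ec_score(1000 * y, 1000 * y_hat)
print(np.isclose(ec_small, ec_large))  # True
```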

Disadvantages:
  • Extreme Outlier Sensitivity: Because the residuals in the numerator are squared, a single large error can dominate the sum and sharply lower the EC score, even if the model predicts the other 99% of the data perfectly.

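  • Non-linear scaling: Since \(1 - \text{EC}\) is the ratio of residual error to total variance, equal steps in EC do not mean equal gains. Moving from 0.90 to 0.95 halves the residual error (0.10 to 0.05), whereas moving from 0.20 to 0.25 reduces it by only about 6% (0.80 to 0.75), which makes linear performance comparisons tricky.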
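The outlier sensitivity described above is easy to see on synthetic data. In the sketch below, the helper `ec_score`, the 100-point series, and the size of the injected error are all assumptions chosen for illustration: the predictions are exact on 99 points and off by 500 on one.

```python
import numpy as np

def ec_score(y_true, y_pred):
    # Hypothetical helper implementing EC = 1 - SS_res / SS_tot
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

y_true = np.arange(100, dtype=float)   # observations 0, 1, ..., 99
y_pred = y_true.copy()                 # perfect predictions

ec_clean = ec_score(y_true, y_pred)    # 1.0

y_pred[50] += 500.0                    # corrupt a single prediction
ec_outlier = ec_score(y_true, y_pred)  # about -2.0

print(ec_clean, round(ec_outlier, 2))
```

The single squared error (250000) is about three times the total sum of squares of the observations (83325), so the score falls from 1.0 to roughly -2.0 despite 99 exact predictions.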


Properties

  • Best possible score: 1.0 (higher is better).

  • Range: (-inf, 1.0]

  • Mathematical Reference: ScienceDirect (CSITE)


Example Usage

from numpy import array
from permetrics.regression import RegressionMetric

## 1. For 1-D array (Single-output)
y_true = array([3, -0.5, 2, 7])
y_pred = array([2.5, 0.0, 2, 8])

evaluator = RegressionMetric(y_true, y_pred)
# Calculate Efficiency Coefficient
print("EC: ", evaluator.EC())

## 2. For > 1-D array (Multi-output)
y_true = array([[0.5, 1], [-1, 1], [7, -6]])
y_pred = array([[0, 2], [-1, 2], [8, -5]])

evaluator = RegressionMetric(y_true, y_pred)
# Return an array of scores for each column
print("EC (Multi-output): ", evaluator.EC(multi_output="raw_values"))
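As a cross-check for the multi-output case, the same formula can be applied column by column in plain NumPy. This is a sketch under the assumption that per-column scores follow the EC equation given earlier; it does not reproduce permetrics internals, and `ec_per_column` is a hypothetical name.

```python
import numpy as np

def ec_per_column(y_true, y_pred):
    # Hypothetical helper: apply EC = 1 - SS_res / SS_tot independently to each column
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

y_true = np.array([[0.5, 1], [-1, 1], [7, -6]])
y_pred = np.array([[0, 2], [-1, 2], [8, -5]])

ec_cols = ec_per_column(y_true, y_pred)
print(np.round(ec_cols, 4))  # [0.9654 0.9082]
```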