FBS - F-Beta Score
The F-Beta Score (FBS) is the generalized form of the F-score. It serves as a single abstraction layer that weighs Recall \(\beta\) times as much as Precision, allowing machine learning engineers to explicitly manipulate the evaluation trade-off between False Positives and False Negatives.
Where \(\beta\) (beta) is a positive real factor determining the weight of Recall relative to Precision.
Engineering Insight: Tuning the \(\beta\) Parameter
The selection of \(\beta\) dictates the strict behavioral policy of your optimization target:
\(\beta = 1.0\): Standard harmonic mean (F1 Score). False Positives and False Negatives are penalized equally.
\(\beta < 1.0\) (e.g., \(0.5\)): Attaches more weight to Precision than Recall. Use this when false alarms (\(FP\)) are unacceptable (e.g., an automated system recommending expensive, irrevocable stock trades).
\(\beta > 1.0\) (e.g., \(2.0\)): Attaches more weight to Recall than Precision. Use this when missed detections (\(FN\)) are catastrophic (e.g., pedestrian detection in autonomous vehicles).
- Boundary Limits
As \(\beta \to 0\), the formula degrades mathematically into pure Precision.
As \(\beta \to +\infty\), the formula degrades mathematically into pure Recall.
Averaging Strategies (Multiclass / Multilabel)
When handling more than two classes, the F-beta score is computed per class and then aggregated using one of the following average parameters:
None: Returns an array/dictionary of independent F-beta scores for each individual class.
macro: Calculates the unweighted arithmetic mean of the class F-beta scores.
micro: Calculates globally by aggregating the total true positives, false negatives, and false positives across all classes.
weighted: Calculates the F-beta score for each class and computes their average weighted by class support, counteracting the distortion of macro-averaging on highly imbalanced datasets.
Properties
Best possible score:
1.0Worst possible score:
0.0Range:
[0.0, 1.0]References: Scikit-Learn fbeta_score Documentation, Wikipedia - F-score
Example Usage
from permetrics.classification import ClassificationMetric
# ==============================================================================
# SCENARIO 1: Binary Classification
# Testing different Beta weights on the exact same prediction array
# ==============================================================================
print("--- 1. BINARY CLASSIFICATION (BETA TUNING) ---")
y_true_bin = [0, 1, 0, 0, 1, 0]
y_pred_bin = [0, 1, 0, 0, 0, 1]
cm_bin = ClassificationMetric(y_true_bin, y_pred_bin)
# 1. Favor Precision (Beta = 0.5)
print(f"FBS (beta=0.5) : {cm_bin.FBS(beta=0.5)}")
# 2. Balanced (Beta = 1.0 -> Identical to F1S)
print(f"FBS (beta=1.0) : {cm_bin.FBS(beta=1.0)}")
# 3. Favor Recall (Beta = 2.0 -> Identical to F2S)
print(f"FBS (beta=2.0) : {cm_bin.FBS(beta=2.0)}")
# ==============================================================================
# SCENARIO 2: Multiclass Classification with Integer Labels (beta=1.5)
# ==============================================================================
print("\n--- 2. MULTICLASS (INTEGER LABELS) EXAMPLES ---")
y_true_multi_int = [0, 1, 2, 0, 1, 2, 0, 2]
y_pred_multi_int = [0, 2, 1, 0, 1, 1, 0, 2]
cm_multi_int = ClassificationMetric(y_true_multi_int, y_pred_multi_int)
print(f"average=None : {cm_multi_int.FBS(beta=1.5, average=None)}")
print(f"average='macro' : {cm_multi_int.FBS(beta=1.5, average='macro')}")
print(f"average='micro' : {cm_multi_int.FBS(beta=1.5, average='micro')}")
print(f"average='weighted' : {cm_multi_int.FBS(beta=1.5, average='weighted')}")
# Filter specific classes with beta=1.5
print(f"Filter [1, 2] (average=None) : {cm_multi_int.FBS(beta=1.5, labels=[1, 2], average=None)}")
print(f"Filter [1, 2] (average='macro') : {cm_multi_int.FBS(beta=1.5, labels=[1, 2], average='macro')}")
print(f"Filter [1, 2] (average='micro') : {cm_multi_int.FBS(beta=1.5, labels=[1, 2], average='micro')}")
print(f"Filter [1, 2] (average='weighted'): {cm_multi_int.FBS(beta=1.5, labels=[1, 2], average='weighted')}")
# ==============================================================================
# SCENARIO 3: Multiclass Classification with String Labels (beta=0.75)
# ==============================================================================
print("\n--- 3. MULTICLASS (CATEGORICAL/STRING LABELS) EXAMPLES ---")
y_true_str = ["cat", "ant", "cat", "cat", "ant", "bird", "bird", "bird"]
y_pred_str = ["ant", "ant", "cat", "cat", "ant", "cat", "bird", "ant"]
cm_str = ClassificationMetric(y_true_str, y_pred_str)
print(f"average=None : {cm_str.FBS(beta=0.75, average=None)}")
print(f"average='macro' : {cm_str.FBS(beta=0.75, average='macro')}")
print(f"average='micro' : {cm_str.FBS(beta=0.75, average='micro')}")
print(f"average='weighted' : {cm_str.FBS(beta=0.75, average='weighted')}")
# Filter string labels with beta=0.75
print(f"Filter 'cat'&'bird' (average=None) : {cm_str.FBS(beta=0.75, labels=['cat', 'bird'], average=None)}")
print(f"Filter 'cat'&'bird' (average='macro') : {cm_str.FBS(beta=0.75, labels=['cat', 'bird'], average='macro')}")
print(f"Filter 'cat'&'bird' (average='micro') : {cm_str.FBS(beta=0.75, labels=['cat', 'bird'], average='micro')}")
print(f"Filter 'cat'&'bird' (average='weighted'): {cm_str.FBS(beta=0.75, labels=['cat', 'bird'], average='weighted')}")