FS - F-Measure Score
The F-Measure Score (FS) (widely known in machine learning as the F1-Score or F-beta Score and, for \(\beta = 1\), identical to the Czekanowski-Dice Index) is an external clustering evaluation metric. It computes the harmonic mean of the pairwise Precision Score (PrS) and Recall Score (ReS), optionally weighted by \(\beta\).
Intuitively, FS provides a single, unified benchmark to evaluate clustering quality. It answers the question: “How well does the clustering partition balance both avoiding false-positive co-clusterings (precision) and avoiding fragmented true classes (recall)?”
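For a generalized weighting parameter \(\beta\) (where \(\beta > 1\) favors Recall, and \(\beta < 1\) favors Precision), the score takes the standard F-beta form:
\[
\text{FS}_\beta = \frac{(1 + \beta^2)\,\text{PrS} \cdot \text{ReS}}{\beta^2\,\text{PrS} + \text{ReS}}, \qquad \text{PrS} = \frac{yy}{yy + ny}, \qquad \text{ReS} = \frac{yy}{yy + yn}
\]
Where, across all pairs of distinct data points: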
\(yy\) (True Positives): Number of pairs placed in the same cluster in both ground truth and prediction.
\(yn\) (False Negatives): True intra-class pairs incorrectly split across different predicted clusters.
\(ny\) (False Positives): Pairs of points from different true classes incorrectly grouped into the same predicted cluster.
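The pair counts defined above can be illustrated with a small brute-force sketch. The function names `pair_counts_bruteforce` and `f_beta_from_counts` are hypothetical helpers written for this illustration and are not part of the library. The weighting assumes the standard F-beta form with \(\beta^2\), which may differ from the library's internal formula.

```python
from itertools import combinations


def pair_counts_bruteforce(y_true, y_pred):
    """Count yy, yn, ny by visiting every pair of distinct points (illustrative only)."""
    yy = yn = ny = 0
    for i, j in combinations(range(len(y_true)), 2):
        same_true = y_true[i] == y_true[j]
        same_pred = y_pred[i] == y_pred[j]
        if same_true and same_pred:
            yy += 1      # together in both partitions
        elif same_true:
            yn += 1      # together in ground truth, split in prediction
        elif same_pred:
            ny += 1      # apart in ground truth, merged in prediction
    return yy, yn, ny


def f_beta_from_counts(yy, yn, ny, beta=1.0):
    """F-beta from pair counts (assumed standard form); requires yy > 0."""
    precision = yy / (yy + ny)
    recall = yy / (yy + yn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2]
yy, yn, ny = pair_counts_bruteforce(y_true, y_pred)
print(yy, yn, ny)                                # 2 1 2
print(f_beta_from_counts(yy, yn, ny))            # approx. 0.5714 (4/7)
print(f_beta_from_counts(yy, yn, ny, beta=2.0))  # approx. 0.625 (5/8)
```

Here precision is 2/4 and recall is 2/3, so raising \(\beta\) to 2 pulls the score toward the higher recall value.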
Algorithmic Optimizations (Performance Note)
Counting co-clustered pairs by brute-force enumeration visits all \(N(N-1)/2\) pairs of points, so it scales quadratically at \(O(N^2)\).
This implementation extracts the exact pair totals (\(yy\), \(yn\), and \(ny\)) directly from the Contingency Matrix, using the sums of squared cell counts and of squared row and column marginals. This reduces the computational runtime to \(O(N)\), which keeps evaluation practical on large-scale datasets.
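A minimal sketch of the contingency-based idea follows. It is not the library's actual code: `pair_counts_contingency` is a hypothetical name, and the sketch uses `collections.Counter` in place of an explicit matrix. Each group of size \(n\) contributes \(n(n-1)/2\) pairs, so \(yy\) comes from the contingency cells, and \(yn\) and \(ny\) come from the row and column marginals minus \(yy\).

```python
from collections import Counter


def pair_counts_contingency(y_true, y_pred):
    """Derive yy, yn, ny from contingency counts in a single pass over the labels."""
    cells = Counter(zip(y_true, y_pred))  # non-zero contingency cells n_ij
    rows = Counter(y_true)                # row marginals: true class sizes
    cols = Counter(y_pred)                # column marginals: predicted cluster sizes

    def n_pairs(counts):
        # Unordered pairs inside each group, summed over groups: sum of n*(n-1)/2
        return sum(n * (n - 1) // 2 for n in counts.values())

    yy = n_pairs(cells)       # pairs together in both partitions
    yn = n_pairs(rows) - yy   # together in ground truth only
    ny = n_pairs(cols) - yy   # together in prediction only
    return yy, yn, ny


print(pair_counts_contingency([0, 0, 1, 1, 2, 2], [0, 0, 1, 1, 1, 2]))  # (2, 1, 2)
```

Because the counters are built with hashing, the work grows linearly with the number of points plus the number of non-empty cells, and no pair is ever enumerated.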
Handling Edge Cases (Finite Values)
The calculation divides by the (weighted) sum of Precision and Recall. If both \(\text{PrS}\) and \(\text{ReS}\) evaluate to zero (meaning the model achieved zero true-positive pair groupings, \(yy = 0\)), the denominator becomes zero and the score is mathematically undefined.
force_finite (bool): If True, catches the zero-division error and returns a safe fallback value instead of raising a ZeroDivisionError. Default is True.
finite_value (float): The fallback value returned when force_finite=True and the calculation fails. Since the worst possible valid score is 0.0, the default fallback is 0.0.
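The fallback behaviour described by these two parameters can be sketched as follows. `f_beta_safe` is a hypothetical helper written for illustration, not the library's implementation. It assumes plain Python numbers, where dividing by zero raises `ZeroDivisionError`, and the standard F-beta weighting with \(\beta^2\).

```python
def f_beta_safe(yy, yn, ny, beta=1.0, force_finite=True, finite_value=0.0):
    """Hypothetical helper mirroring the documented force_finite / finite_value semantics."""
    try:
        precision = yy / (yy + ny)
        recall = yy / (yy + yn)
        b2 = beta ** 2
        return (1 + b2) * precision * recall / (b2 * precision + recall)
    except ZeroDivisionError:
        if force_finite:
            return finite_value  # safe fallback instead of an exception
        raise                    # caller asked for the raw failure


print(f_beta_safe(0, 1, 2))                      # 0.0 (fallback, since yy = 0)
print(f_beta_safe(0, 1, 2, finite_value=-1.0))   # -1.0 (custom fallback)
```

With NumPy floating-point values the same division would typically yield `nan` and a warning rather than an exception, so an implementation built on arrays would need an explicit check instead of `try`/`except`.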
Properties
Best possible score: 1.0 (Indicates identical partitions where precision and recall are both 100%).
Worst possible score: 0.0 (The predicted clustering shares zero co-clustered pairs with the ground truth).
Permutation Invariance: The metric is strictly invariant to permutations of cluster labels.
Symmetry: When \(\beta = 1.0\), the score is symmetric: \(\text{FS}_1(y_{true}, y_{pred}) = \text{FS}_1(y_{pred}, y_{true})\).
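Mathematical Identity: \(\text{FS}_1 \equiv \text{Czekanowski-Dice Index} = \frac{2\,yy}{2\,yy + yn + ny}\). The Ochiai Index is the geometric mean of precision and recall rather than the harmonic mean, so it coincides with \(\text{FS}_1\) only when precision equals recall.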
Range: [0.0, 1.0]
References: Desgraupes, Bernard. “Clustering indices.” University of Paris Ouest-Lab Modal’X 1.1 (2013): 34.
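The symmetry and Czekanowski-Dice identity listed under Properties can be checked with a short derivation. It assumes the pairwise definitions \(\text{PrS} = yy/(yy+ny)\) and \(\text{ReS} = yy/(yy+yn)\):

\[
\text{FS}_1 = \frac{2\,\text{PrS}\cdot\text{ReS}}{\text{PrS}+\text{ReS}}
= \frac{\dfrac{2\,yy^2}{(yy+ny)(yy+yn)}}{\dfrac{yy\,(2\,yy+yn+ny)}{(yy+ny)(yy+yn)}}
= \frac{2\,yy}{2\,yy+yn+ny}
\]

Swapping \(y_{true}\) and \(y_{pred}\) only exchanges \(yn\) and \(ny\), which leaves the final expression unchanged. This is why the score is symmetric for \(\beta = 1\) but generally not for other values of \(\beta\). For example, with \(yy = 2\), \(yn = 1\), \(ny = 2\), the expression gives \(4/7 \approx 0.571\).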
Example Usage
from permetrics.clustering import ClusteringMetric
# ==============================================================================
# SCENARIO 1: Basic Evaluation (Balanced F1-Score)
# ==============================================================================
print("--- 1. BASIC F-MEASURE SCORE EXAMPLE ---")
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2]
cm = ClusteringMetric(y_true=y_true, y_pred=y_pred)
fms_score = cm.FS()
print(f"F-Measure Score (Beta=1.0): {fms_score}")
# ==============================================================================
# SCENARIO 2: Favoring Recall via Beta parameter
# ==============================================================================
print("\n--- 2. WEIGHTED F-MEASURE EXAMPLE ---")
# Weigh Recall twice as much as Precision (F2-Score)
f2_score = cm.FS(beta=2.0)
print(f"F-Measure Score (Beta=2.0): {f2_score}")