VMS - V-Measure Score

The V-Measure Score (VMS) is an external clustering evaluation metric: it compares predicted cluster assignments against ground-truth class labels. It is the (optionally weighted) harmonic mean of the Homogeneity Score (HS) and the Completeness Score (CS), giving a single score that is high only when both components are high.

Intuitively, V-Measure acts similarly to the F1-score in classification, but applies to information-theoretic clustering properties. It answers the question: “How well does the clustering partition maximize both cluster purity (homogeneity) and class assignment coverage (completeness)?” A score of 1.0 indicates a perfect match where all clusters are completely pure and all classes are fully recovered.
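To make the purity-versus-coverage intuition concrete, the following is a minimal from-scratch sketch. It assumes the standard entropy-based definitions of homogeneity and completeness, which this page does not spell out, and the helper names are hypothetical, not part of permetrics. It scores an over-split partition in which every point gets its own cluster: each cluster is pure, but every class is scattered across clusters.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, given):
    """Conditional entropy H(targets | given), computed from joint label counts."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def homogeneity_completeness(y_true, y_pred):
    """Assumed definitions: HS = 1 - H(C|K)/H(C), CS = 1 - H(K|C)/H(K)."""
    h_c, h_k = entropy(y_true), entropy(y_pred)
    hs = 1.0 if h_c == 0 else 1.0 - conditional_entropy(y_true, y_pred) / h_c
    cs = 1.0 if h_k == 0 else 1.0 - conditional_entropy(y_pred, y_true) / h_k
    return hs, cs


y_true = [0, 0, 1, 1]
y_pred = [0, 1, 2, 3]  # over-split: every point is its own cluster

hs, cs = homogeneity_completeness(y_true, y_pred)
vms = 2 * hs * cs / (hs + cs)  # balanced harmonic mean (beta = 1)
print(f"HS={hs:.3f}, CS={cs:.3f}, VMS={vms:.3f}")
```

Homogeneity alone would rate this partition as perfect; the harmonic mean pulls the combined score down because completeness is poor.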

\[\text{VMS} = \frac{(1 + \beta) \times \text{HS} \times \text{CS}}{(\beta \times \text{HS}) + \text{CS}}\]

By default, \(\beta = 1.0\), which assigns equal weight to both components, simplifying the formulation to:

\[\text{VMS}_1 = \frac{2 \times \text{HS} \times \text{CS}}{\text{HS} + \text{CS}}\]

Where:

  • \(\text{HS}\) is the Homogeneity Score.

  • \(\text{CS}\) is the Completeness Score.

  • \(\beta\) is the weight parameter. If \(\beta > 1\), completeness is weighted more heavily; if \(\beta < 1\), homogeneity is favored.
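The effect of \(\beta\) can be checked numerically. The sketch below is a direct transcription of the formula above into a hypothetical helper (it is not the permetrics implementation), and the HS and CS values are made up for illustration. It assumes HS and CS are not both zero.

```python
def v_measure(hs: float, cs: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of homogeneity (hs) and completeness (cs).

    Assumes beta * hs + cs is non-zero; the zero case is not handled here.
    """
    return (1.0 + beta) * hs * cs / (beta * hs + cs)


# Illustrative values: clusters are fairly pure, classes are only half recovered.
hs, cs = 0.8, 0.5

balanced = v_measure(hs, cs)            # beta = 1.0, equal weight
favor_cs = v_measure(hs, cs, beta=2.0)  # completeness weighted more heavily
favor_hs = v_measure(hs, cs, beta=0.5)  # homogeneity weighted more heavily

print(f"beta=1.0: {balanced:.4f}")
print(f"beta=2.0: {favor_cs:.4f}")
print(f"beta=0.5: {favor_hs:.4f}")
```

Because completeness is the weaker component in this example, raising \(\beta\) lowers the score and lowering \(\beta\) raises it.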


Handling Edge Cases (Finite Values)

The VMS formula divides by \(\beta \times \text{HS} + \text{CS}\), which for the default \(\beta = 1\) is the sum of Homogeneity and Completeness. Because both scores are non-negative, this denominator is zero only when \(\text{HS}\) and \(\text{CS}\) are both exactly zero, meaning the clusters carry no information about the classes. In that case the score is mathematically undefined.

  • force_finite (bool): If True, the function catches the zero-division error when \(\text{HS} + \text{CS} = 0\) and returns a safe fallback value instead of raising a ValueError. Default is True.

  • finite_value (float): The specific fallback value returned when force_finite=True and the calculation fails. Since the worst possible valid score is 0.0, the default fallback is 0.0.
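The fallback behaviour described by these two parameters can be sketched as follows. This is an illustration under stated assumptions, not the library's actual code: the function name is hypothetical, and it takes precomputed HS and CS values instead of label arrays.

```python
def v_measure_safe(hs, cs, beta=1.0, force_finite=True, finite_value=0.0):
    """V-Measure with an explicit fallback for the undefined 0/0 case.

    Hypothetical helper mirroring the force_finite / finite_value parameters
    described above.
    """
    denominator = beta * hs + cs
    if denominator == 0.0:
        if force_finite:
            # Undefined score: return the caller-supplied fallback.
            return finite_value
        raise ValueError("V-Measure is undefined when HS and CS are both zero.")
    return (1.0 + beta) * hs * cs / denominator


print(v_measure_safe(0.0, 0.0))                     # fallback value
print(v_measure_safe(0.0, 0.0, finite_value=-1.0))  # custom fallback value
print(v_measure_safe(1.0, 1.0))                     # regular computation
```

Returning a finite fallback keeps batch evaluations from failing on a single degenerate partition, while force_finite=False makes the degenerate case visible as an error.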


Properties

  • Range: [0, 1].

  • Best possible score is 1.0; higher values are better.

Example Usage

from permetrics.clustering import ClusteringMetric

# ==============================================================================
# SCENARIO 1: Basic Evaluation (Balanced Beta = 1.0)
# ==============================================================================
print("--- 1. BASIC V-MEASURE SCORE EXAMPLE ---")

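# Ground-truth class labels and predicted cluster labels.
# The two partitions are identical here, so this is a perfect match.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 2, 2]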

cm = ClusteringMetric(y_true=y_true, y_pred=y_pred)
vms_score = cm.VMS()
print(f"V-Measure Score (Beta=1.0): {vms_score}")

# ==============================================================================
# SCENARIO 2: Adjusting Beta Weight
# ==============================================================================
print("\n--- 2. CUSTOM BETA WEIGHT EXAMPLE ---")

# Favor homogeneity by setting beta < 1.0.
# For a perfect match (HS = CS = 1) the formula gives 1.0 for any beta.
vms_custom = cm.VMS(beta=0.5)
print(f"V-Measure Score (Beta=0.5): {vms_custom}")