Clustering Metrics

STT

Metric

Metric Fullname

Characteristics

1

BHI

Ball Hall Index

Smaller is better (Best = 0), Range=[0, +inf)

2

XBI

Xie Beni Index

Smaller is better (Best = 0), Range=[0, +inf)

3

DBI

Davies Bouldin Index

Smaller is better (Best = 0), Range=[0, +inf)

4

BRI

Banfeld Raftery Index

Smaller is better (No best value), Range=(-inf, inf)

5

KDI

Ksq Detw Index

Smaller is better (No best value), Range=(-inf, +inf)

6

DRI

Det Ratio Index

Bigger is better (No best value), Range=[0, +inf)

7

DI

Dunn Index

Bigger is better (No best value), Range=[0, +inf)

8

CHI

Calinski Harabasz Index

Bigger is better (No best value), Range=[0, inf)

9

LDRI

Log Det Ratio Index

Bigger is better (No best value), Range=(-inf, +inf)

10

LSRI

Log SS Ratio Index

Bigger is better (No best value), Range=(-inf, +inf)

11

SI

Silhouette Index

Bigger is better (Best = 1), Range = [-1, +1]

12

SSEI

Sum of Squared Error Index

Smaller is better (Best = 0), Range = [0, +inf)

13

MSEI

Mean Squared Error Index

Smaller is better (Best = 0), Range = [0, +inf)

14

DHI

Duda-Hart Index

Smaller is better (Best = 0), Range = [0, +inf)

15

BI

Beale Index

Smaller is better (Best = 0), Range = [0, +inf)

16

RSI

R-squared Index

Bigger is better (Best=1), Range = (-inf, 1]

17

DBCVI

Density-based Clustering Validation Index

Bigger is better (Best=0), Range = [0, 1]

18

HI

Hartigan Index

Bigger is better (best=0), Range = [0, +inf)

19

MIS

Mutual Info Score

Bigger is better (No best value), Range = [0, +inf)

20

NMIS

Normalized Mutual Info Score

Bigger is better (Best = 1), Range = [0, 1]

21

RaS

Rand Score

Bigger is better (Best = 1), Range = [0, 1]

22

ARS

Adjusted Rand Score

Bigger is better (Best = 1), Range = [-1, 1]

23

FMS

Fowlkes Mallows Score

Bigger is better (Best = 1), Range = [0, 1]

24

HS

Homogeneity Score

Bigger is better (Best = 1), Range = [0, 1]

25

CS

Completeness Score

Bigger is better (Best = 1), Range = [0, 1]

26

VMS

V-Measure Score

Bigger is better (Best = 1), Range = [0, 1]

27

PrS

Precision Score

Bigger is better (Best = 1), Range = [0, 1]

28

ReS

Recall Score

Bigger is better (Best = 1), Range = [0, 1]

29

FmS

F-Measure Score

Bigger is better (Best = 1), Range = [0, 1]

30

CDS

Czekanowski Dice Score

Bigger is better (Best = 1), Range = [0, 1]

31

HGS

Hubert Gamma Score

Bigger is better (Best = 1), Range=[-1, +1]

32

JS

Jaccard Score

Bigger is better (Best = 1), Range = [0, 1]

33

KS

Kulczynski Score

Bigger is better (Best = 1), Range = [0, 1]

34

MNS

Mc Nemar Score

Bigger is better (No best value), Range=(-inf, +inf)

35

PhS

Phi Score

Bigger is better (No best value), Range = (-inf, +inf)

36

RTS

Rogers Tanimoto Score

Bigger is better (Best = 1), Range = [0, 1]

37

RRS

Russel Rao Score

Bigger is better (Best = 1), Range = [0, 1]

38

SS1S

Sokal Sneath1 Score

Bigger is better (Best = 1), Range = [0, 1]

39

SS2S

Sokal Sneath2 Score

Bigger is better (Best = 1), Range = [0, 1]

40

PuS

Purity Score

Bigger is better (Best = 1), Range = [0, 1]

41

ES

Entropy Score

Smaller is better (Best = 0), Range = [0, +inf)

42

TS

Tau Score

Bigger is better (No best value), Range = (-inf, +inf)

43

GAS

Gamma Score

Bigger is better (Best = 1), Range = [-1, 1]

44

GPS

Gplus Score

Smaller is better (Best = 0), Range = [0, 1]

Most of the clustering metrics is implemented based on the paper [20]

There are several types of clustering metrics that are commonly used to evaluate the quality of clustering results.

  • Internal evaluation metrics: These are metrics that evaluate the clustering results based solely on the data and the clustering algorithm used, without any external information. Examples of internal evaluation metrics include Silhouette Coefficient, Calinski-Harabasz Index, and Davies-Bouldin Index.

  • External evaluation metrics: These are metrics that evaluate the clustering results by comparing them to some external reference, such as expert labels or a gold standard. Examples of external evaluation metrics include Adjusted Rand score, Normalized Mutual Information score, and Fowlkes-Mallows score.

It’s important to choose the appropriate clustering metrics based on the specific problem and data at hand.

In this library, metrics that belong to the internal evaluation category will have a metric name suffix of “index” On the other hand, metrics that belong to the external evaluation category will have a metric name suffix of “score”