Welcome to PerMetrics’s documentation!
PerMetrics is a Python library of performance metrics for machine learning models. We aim to implement all performance metrics for problems such as regression, classification, and clustering, so that users in every field can access the metrics they need as quickly as possible.
Free software: GNU General Public License (GPL) v3
Total metrics: 112 (48 regression metrics, 20 classification metrics, 44 clustering metrics)
Documentation: https://permetrics.readthedocs.io/en/latest/
Python versions: >= 3.11
Dependencies: numpy, scipy
Quick Start
Models Document
- Regression Metrics
  - EVS - Explained Variance Score
  - ME - Max Error
  - MAE - Mean Absolute Error
  - MSE - Mean Squared Error
  - MBE - Mean Bias Error
  - RMSE - Root Mean Square Error
  - MSLE - Mean Squared Logarithmic Error
  - MedAE - Median Absolute Error
  - MRE - Mean Relative Error
  - MPE - Mean Percentage Error
  - MAPE - Mean Absolute Percentage Error
  - SMAPE - Symmetric Mean Absolute Percentage Error
  - MAAPE - Mean Arctangent Absolute Percentage Error
  - MASE - Mean Absolute Scaled Error
  - NSE - Nash-Sutcliffe Efficiency
  - NNSE - Normalized Nash-Sutcliffe Efficiency
  - WI - Willmott Index of Agreement
  - R - Pearson’s Correlation Coefficient
  - AR - Absolute Pearson’s Correlation Coefficient
  - R2 - Coefficient of Determination
  - AR2 - Adjusted Coefficient of Determination
  - CI - Confidence Index
  - R2S - Squared Pearson’s Correlation Coefficient
  - DRV - Deviation of Runoff Volume
  - KGE - Kling-Gupta Efficiency
  - GINI - Regression Gini
  - PCD - Prediction of Change in Direction
  - CE - Cross Entropy
  - KLD - Kullback-Leibler Divergence
  - JSD - Jensen-Shannon Divergence
  - VAF - Variance Accounted For
  - RAE - Relative Absolute Error
  - RRSE - Root Relative Squared Error
  - A10 - A10 Index
  - A20 - A20 Index
  - A30 - A30 Index
  - NRMSE - Normalized Root Mean Square Error
  - RSE - Residual Standard Error
  - COV - Covariance
  - COR - Correlation Coefficient
  - EC - Efficiency Coefficient
  - OI - Overall Index
  - CRM - Coefficient of Residual Mass
  - RE - Relative Error
  - AE - Absolute Error
  - SE - Squared Error
  - SLE - Squared Log Error
  - All regression metrics
- Classification Metrics
  - AS - Accuracy Score
  - CKS - Cohen’s Kappa Score
  - F1S - F1 Score
  - F2S - F2 Score
  - FBS - F-Beta Score
  - GINI - Gini Index
  - GMS - Geometric Mean Score
  - PS - Precision Score
  - NPV - Negative Predictive Value
  - RS - Recall Score
  - SS - Specificity Score
  - MCC - Matthews Correlation Coefficient
  - ROC AUC Score
  - LS - Lift Score
  - HML - Hamming Loss
  - HGL - Hinge Loss
  - JSS - Jaccard Similarity Score
  - KLDL - Kullback-Leibler Divergence Loss
  - BSL - Brier Score Loss
  - CEL - CrossEntropy Loss
  - All classification metrics
- Clustering Metrics
  - BHI - Ball-Hall Index
  - CHI - Calinski-Harabasz Index
  - XBI - Xie-Beni Index
  - DBI - Davies-Bouldin Index
  - BRI - Banfeld-Raftery Index
  - DRI - Det-Ratio Index
  - KDI - Ksq-DetW Index
  - LDRI - Log Det Ratio Index
  - DI - Dunn Index
  - LSRI - Log SS Ratio Index
  - SI - Silhouette Index
  - SSEI - Sum of Squared Error Index
  - MSEI - Mean Squared Error Index
  - DHI - Duda-Hart Index
  - BI - Beale Index
  - RSI - R-Squared Index
  - DBCVI - Density-Based Clustering Validation Index
  - HI - Hartigan Index
  - MIS - Mutual Information Score
  - NMIS - Normalized Mutual Information Score
  - RaS - Rand Score
  - ARS - Adjusted Rand Score
  - FMS - Fowlkes-Mallows Score
  - HS - Homogeneity Score
  - CS - Completeness Score
  - VMS - V-Measure Score
  - PrS - Precision Score
  - ReS - Recall Score
  - FS - F-Measure Score
  - CDS - Czekanowski-Dice Score
  - HGS - Hubert Gamma Score
  - JS - Jaccard Score
  - KS - Kulczynski Score
  - MNS - McNemar Score
  - PhS - Phi Score
  - RTS - Rogers-Tanimoto Score
  - RRS - Russell-Rao Score
  - SS1S & SS2S - Sokal-Sneath Scores
  - PuS - Purity Score
  - EnS - Entropy Score
  - TauS - Tau Score
  - GAS - Gamma Score
  - GPS - G-Plus Score
  - All clustering metrics
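To give a feel for what the three metric families listed above measure, here is a minimal pure-Python sketch of a few of them: MAE, RMSE, and NSE for regression, the accuracy score for classification, and the Rand score for clustering. It uses only the standard library and the textbook definitions. The function names (`mae`, `rmse`, `nse`, `accuracy`, `rand_score`) are hypothetical, chosen for illustration only; they are not the PerMetrics API and may differ from the library in edge-case handling.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |y_true - y_pred|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def nse(y_true, y_pred):
    """Nash-Sutcliffe Efficiency: 1 - SS_res / SS_tot around the observed mean.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true, labels_pred):
    """Accuracy Score: fraction of samples whose predicted label is correct."""
    hits = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return hits / len(labels_true)


def rand_score(labels_true, labels_pred):
    """Rand Score: fraction of sample pairs on which two partitions agree.

    A pair agrees when both partitions put the two samples together,
    or both put them apart. Assumes at least two samples.
    """
    n = len(labels_true)
    agree = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_true = labels_true[i] == labels_true[j]
            same_pred = labels_pred[i] == labels_pred[j]
            if same_true == same_pred:
                agree += 1
    return agree / (n * (n - 1) / 2)


if __name__ == "__main__":
    # Illustrative data only, not taken from the documentation.
    y_true = [3.0, -0.5, 2.0, 7.0]
    y_pred = [2.5, 0.0, 2.0, 8.0]
    print("MAE :", mae(y_true, y_pred))
    print("RMSE:", rmse(y_true, y_pred))
    print("NSE :", nse(y_true, y_pred))
    print("AS  :", accuracy([0, 1, 1, 0], [0, 1, 0, 0]))
    print("RaS :", rand_score([0, 0, 1, 1], [1, 1, 0, 0]))
```

Note that the Rand score compares partitions rather than label values, so the two clusterings in the last line count as identical even though their cluster ids are swapped.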
Models API: