Introduction
PerMetrics is a Python library of PERformance METRICS (PerMetrics) for machine learning models.
- The goals of this framework are:
  - Combine all metrics for regression, classification and clustering models
  - Help users in all fields access metrics as quickly as possible
  - Perform qualitative analysis of models
  - Perform quantitative analysis of models
- It currently contains 2 sub-packages:
  - regression: contains 47 metrics
  - classification: contains 17 metrics
If you find my code and data useful and use them, please cite my work:
@software{thieu_nguyen_2020_3951205,
author = {Thieu Nguyen},
title = {A framework of PERformance METRICS (PerMetrics) for artificial intelligence models},
month = jul,
year = 2020,
publisher = {Zenodo},
doi = {10.5281/zenodo.3951205},
url = {https://doi.org/10.5281/zenodo.3951205}
}
Setup
Install the [current PyPI release](https://pypi.python.org/pypi/permetrics):
pip install permetrics==1.4.0
Or install the development version from GitHub:
pip install git+https://github.com/thieu1995/permetrics
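After installing, it can help to confirm which release is active in the current environment. The snippet below is a minimal sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is made up for this illustration and is not part of permetrics.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if permetrics is not installed.
print(installed_version("permetrics"))
```

If the pinned release was installed, the printed string should match the version given in the `pip install` command.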
Examples
Functional Style
This is the traditional way to call a specific metric. Every time you call a function, you need to pass y_true and y_pred.
## 1. Import packages and classes
## 2. Create the evaluator object
## 3. Call metric functions on the object
import numpy as np
from permetrics.regression import RegressionMetric
y_true = np.array([3, -0.5, 2, 7, 5, 6])
y_pred = np.array([2.5, 0.0, 2, 8, 5, 6])
evaluator = RegressionMetric()
## 3.1 Call a specific metric on the object; each metric has 3 equivalent names, as shown below
rmse_1 = evaluator.RMSE(y_true, y_pred)
rmse_2 = evaluator.rmse(y_true, y_pred)
rmse_3 = evaluator.root_mean_squared_error(y_true, y_pred)
print(f"RMSE: {rmse_1}, {rmse_2}, {rmse_3}")
mse = evaluator.MSE(y_true, y_pred)
mae = evaluator.MAE(y_true, y_pred, decimal=5)
print(f"MSE: {mse}, MAE: {mae}")
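As a cross-check on the values printed above, the same three metrics can be computed by hand from their standard textbook definitions. This is a plain-Python sketch, not the code permetrics runs internally, and results may differ in the last digits depending on how the `decimal` argument rounds.

```python
import math

y_true = [3, -0.5, 2, 7, 5, 6]
y_pred = [2.5, 0.0, 2, 8, 5, 6]

# Per-sample errors between targets and predictions.
errors = [t - p for t, p in zip(y_true, y_pred)]

mse = sum(e ** 2 for e in errors) / len(errors)   # mean squared error
rmse = math.sqrt(mse)                              # root mean squared error
mae = sum(abs(e) for e in errors) / len(errors)    # mean absolute error

print(f"MSE: {mse}, RMSE: {rmse}, MAE: {round(mae, 5)}")
```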
import numpy as np
from permetrics.classification import ClassificationMetric
y_true = [0, 1, 0, 0, 1, 0]
y_pred = [0, 1, 0, 0, 0, 1]
evaluator = ClassificationMetric()
ps1 = evaluator.precision_score(y_true, y_pred, decimal=5)
ps2 = evaluator.ps(y_true, y_pred, decimal=3)
ps3 = evaluator.PS(y_true, y_pred, decimal=4)
print(f"Precision: {ps1}, {ps2}, {ps3}")
recall = evaluator.recall_score(y_true, y_pred)
accuracy = evaluator.accuracy_score(y_true, y_pred)
print(f"recall: {recall}, accuracy: {accuracy}")
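For intuition about what these scores measure, the sketch below derives precision, recall, and accuracy from confusion-matrix counts in plain Python. It assumes label 1 is the positive class and does no averaging over classes; permetrics accepts an `average` option, so its default output may differ from these binary values.

```python
y_true = [0, 1, 0, 0, 1, 0]
y_pred = [0, 1, 0, 0, 0, 1]

pairs = list(zip(y_true, y_pred))

# Confusion-matrix counts, treating label 1 as the positive class (an assumption).
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)

precision = tp / (tp + fp)          # share of predicted positives that are correct
recall = tp / (tp + fn)             # share of actual positives that were found
accuracy = (tp + tn) / len(pairs)   # share of all predictions that are correct

print(f"precision: {precision}, recall: {recall}, accuracy: {round(accuracy, 5)}")
```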
OOP Style
This is the more modern and convenient way to use metrics. You only need to pass y_true and y_pred once, when creating the metric object.
After that, you can get the value of any metric without passing y_true and y_pred again.
## 1. Import packages and classes
## 2. Create the evaluator object
## 3. Call metric functions on the object
import numpy as np
from permetrics.regression import RegressionMetric
y_true = np.array([3, -0.5, 2, 7, 5, 6])
y_pred = np.array([2.5, 0.0, 2, 8, 5, 6])
evaluator = RegressionMetric(y_true, y_pred, decimal=5)
## Get the result of any metric you want
rmse = evaluator.RMSE()
mse = evaluator.MSE()
mae = evaluator.MAE()
print(f"RMSE: {rmse}, MSE: {mse}, MAE: {mae}")
import numpy as np
from permetrics.classification import ClassificationMetric
y_true = [0, 1, 0, 0, 1, 0]
y_pred = [0, 1, 0, 0, 0, 1]
evaluator = ClassificationMetric(y_true, y_pred, decimal=5)
## Get the result of any metric you want
hamming_loss = evaluator.hamming_loss()
mcc = evaluator.matthews_correlation_coefficient()
specificity = evaluator.specificity_score()
print(f"HL: {hamming_loss}, MCC: {mcc}, specificity: {specificity}")
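The three scores in the classification example above can also be worked out by hand from their usual binary definitions. The sketch below is illustrative only: it assumes label 1 is the positive class and applies no class averaging, which may not match the defaults permetrics uses.

```python
import math

y_true = [0, 1, 0, 0, 1, 0]
y_pred = [0, 1, 0, 0, 0, 1]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)

# Hamming loss: fraction of labels that were predicted incorrectly.
hamming_loss = sum(1 for t, p in pairs if t != p) / len(pairs)

# Matthews correlation coefficient from the four confusion-matrix counts.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

# Specificity: share of actual negatives that were predicted negative.
specificity = tn / (tn + fp)

print(f"HL: {round(hamming_loss, 5)}, MCC: {mcc}, specificity: {specificity}")
```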
Multiple Metrics
To reduce coding time when using multiple metrics, permetrics offers a few ways to compute them together in the OOP style.
import numpy as np
from permetrics.regression import RegressionMetric
y_true = np.array([3, -0.5, 2, 7, 5, 6])
y_pred = np.array([2.5, 0.0, 2, 8, 5, 6])
evaluator = RegressionMetric(y_true, y_pred, decimal=5)
## Define the list of metrics you want to use
list_metrics = ["RMSE", "MAE", "MAPE", "NSE"]
## 1. Get metrics one at a time in a loop
list_results = []
for metric in list_metrics:
    list_results.append(evaluator.get_metric_by_name(metric))
print(list_results)
## 2. Get metrics from a list of names
dict_result_2 = evaluator.get_metrics_by_list_names(list_metrics)
print(dict_result_2)
## 3. Get metrics from a dict of names and parameters
dict_metrics = {
    "RMSE": {"decimal": 5},
    "MAE": {"decimal": 4},
    "MAPE": None,
    "NSE": {"decimal": 3},
}
dict_result_3 = evaluator.get_metrics_by_dict(dict_metrics)
print(dict_result_3)
import numpy as np
from permetrics.classification import ClassificationMetric
y_true = [0, 1, 0, 0, 1, 0]
y_pred = [0, 1, 0, 0, 0, 1]
evaluator = ClassificationMetric(y_true, y_pred, decimal=5)
## 1. Get metrics one at a time in a loop
list_metrics = ["PS", "RS", "LS", "SS"]
list_results = []
for metric in list_metrics:
    list_results.append(evaluator.get_metric_by_name(metric))
print(list_results)
## 2. Get metrics from a list of names
dict_result_2 = evaluator.get_metrics_by_list_names(list_metrics)
print(dict_result_2)
## 3. Get metrics from a dict of names and parameters
dict_metrics = {
    "PS": {"average": "micro"},
    "RS": {"average": "macro"},
    "LS": None,
    "SS": {"average": "weighted"},
}
dict_result_3 = evaluator.get_metrics_by_dict(dict_metrics)
print(dict_result_3)
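The name-based lookups in this section can be pictured as attribute dispatch on an object that already holds the data. The sketch below is a hypothetical stand-in, not permetrics' implementation: the class `TinyEvaluator` and its two metrics are invented here, and only the method name `get_metrics_by_dict` is borrowed from the examples above to show the idea.

```python
import math


class TinyEvaluator:
    """Hypothetical stand-in: stores the data once and looks metrics up by name."""

    def __init__(self, y_true, y_pred):
        self.y_true = list(y_true)
        self.y_pred = list(y_pred)

    def MAE(self, decimal=5):
        total = sum(abs(t - p) for t, p in zip(self.y_true, self.y_pred))
        return round(total / len(self.y_true), decimal)

    def RMSE(self, decimal=5):
        total = sum((t - p) ** 2 for t, p in zip(self.y_true, self.y_pred))
        return round(math.sqrt(total / len(self.y_true)), decimal)

    def get_metrics_by_dict(self, metrics):
        # Resolve each name to a method with getattr; None means "use defaults".
        results = {}
        for name, params in metrics.items():
            results[name] = getattr(self, name)(**(params or {}))
        return results


evaluator = TinyEvaluator([3, -0.5, 2, 7, 5, 6], [2.5, 0.0, 2, 8, 5, 6])
results = evaluator.get_metrics_by_dict({"RMSE": {"decimal": 3}, "MAE": None})
print(results)
```

Passing `None` for a metric falls back to that method's default arguments, which mirrors how the `"MAPE": None` and `"LS": None` entries are written in the dictionaries above.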
Multiple Outputs Multiple Metrics
Scikit-learn has limited support for multi-output metrics, but permetrics can produce multi-output results for all of its metrics.
import numpy as np
from permetrics.regression import RegressionMetric
## y_true and y_pred each have 4 columns, i.e. 4 outputs
y_true = np.array([[3, -0.5, 2, 7],
                   [5, 6, -0.3, 9],
                   [-11, 23, 8, 3.9]])
y_pred = np.array([[2.5, 0.0, 2, 8],
                   [5.2, 5.4, 0, 9.1],
                   [-10, 23, 8.2, 4]])
evaluator = RegressionMetric(y_true, y_pred, decimal=5)
## 1. By default, all metrics can automatically return the multi-output results
# rmse = evaluator.RMSE()
# print(rmse)
## 2. If you want the mean over all outputs, set the parameter multi_output="mean"
# rmse_2 = evaluator.RMSE(multi_output="mean")
# print(rmse_2)
## 3. If some outputs are more important than others, you can set a weight for each output.
# rmse_3 = evaluator.RMSE(multi_output=[0.5, 0.05, 0.1, 0.35])
# print(rmse_3)
## Get multiple metrics with multi-output or single-output by parameters
## 1. Get metrics from a list of names and a matching list of parameters
list_metrics = ["RMSE", "MAE", "MSE"]
list_paras = [
    {"decimal": 3, "multi_output": "mean"},
    {"decimal": 4, "multi_output": [0.5, 0.2, 0.1, 0.2]},
    {"decimal": 5, "multi_output": "raw_values"}
]
dict_result_1 = evaluator.get_metrics_by_list_names(list_metrics, list_paras)
print(dict_result_1)
## 2. Get metrics from a dict of names and parameters
dict_metrics = {
    "RMSE": {"decimal": 5, "multi_output": "mean"},
    "MAE": {"decimal": 4, "multi_output": "raw_values"},
    "MSE": {"decimal": 2, "multi_output": [0.5, 0.2, 0.1, 0.2]},
}
dict_result_2 = evaluator.get_metrics_by_dict(dict_metrics)
print(dict_result_2)
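To make the three `multi_output` modes concrete, the sketch below computes RMSE per column in plain Python and then reduces it in the two ways described above. It is an illustration under assumptions: the helper `rmse_per_output` is made up here, and treating the weight list as a weighted sum of per-output scores is a guess at the intended behaviour rather than something confirmed from the library.

```python
import math

y_true = [[3, -0.5, 2, 7], [5, 6, -0.3, 9], [-11, 23, 8, 3.9]]
y_pred = [[2.5, 0.0, 2, 8], [5.2, 5.4, 0, 9.1], [-10, 23, 8.2, 4]]


def rmse_per_output(y_true, y_pred):
    """Return one RMSE value per column (output)."""
    n_samples = len(y_true)
    n_outputs = len(y_true[0])
    scores = []
    for j in range(n_outputs):
        squared = sum((y_true[i][j] - y_pred[i][j]) ** 2 for i in range(n_samples))
        scores.append(math.sqrt(squared / n_samples))
    return scores


# "raw_values": one score per output column.
raw_values = rmse_per_output(y_true, y_pred)

# "mean": plain average of the per-output scores.
mean_value = sum(raw_values) / len(raw_values)

# Weight list: weighted sum of the per-output scores (assumed interpretation).
weights = [0.5, 0.2, 0.1, 0.2]
weighted_value = sum(w * s for w, s in zip(weights, raw_values))

print(raw_values, mean_value, weighted_value)
```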
Important links
Official source code repo: https://github.com/thieu1995/permetrics
Official documentation: https://permetrics.readthedocs.io/
Download releases: https://pypi.org/project/permetrics/
Issue tracker: https://github.com/thieu1995/permetrics/issues
- This project is also related to my other projects, “meta-heuristics” and “neural-network”; check them out here