Accuracy, precision, recall, and F1-score are commonly used performance metrics to evaluate the effectiveness of a classification model. These metrics provide insights into different aspects of the model’s performance in predicting class labels. Here’s a brief explanation of each metric:
1. Accuracy: Accuracy measures the overall correctness of the model’s predictions. It calculates the ratio of correct predictions to the total number of predictions made. Accuracy is often used as a general performance metric when the classes are balanced (i.e., there is a roughly equal number of samples in each class). However, it can be misleading when the classes are imbalanced, because a model that simply predicts the majority class for every sample can still achieve a high score.
2. Precision: Precision focuses on the proportion of correctly predicted positive instances (true positives) out of all instances predicted as positive (true positives + false positives). Precision indicates the model’s ability to avoid false positive predictions. It is useful when the cost of false positives is high, such as in medical diagnosis.
3. Recall: Recall, also known as sensitivity or true positive rate, measures the proportion of correctly predicted positive instances (true positives) out of all actual positive instances (true positives + false negatives). Recall indicates the model’s ability to capture all positive instances and avoid false negatives. It is valuable when the cost of false negatives is high, such as in detecting rare diseases.
4. F1-score: The F1-score is the harmonic mean of precision and recall. It provides a balanced measure of the model’s performance by considering both precision and recall. The F1-score is useful when you want to assess the model’s overall performance while considering both false positives and false negatives. It ranges from 0 to 1, where a higher value indicates better performance.
These metrics are commonly used in binary classification tasks, where there are two classes (e.g., positive and negative). They can also be extended to multi-class classification problems by calculating micro or macro averages across different classes.
When evaluating a classification model, it is essential to consider the specific problem, the relative importance of false positives and false negatives, and choose the appropriate performance metric(s) accordingly.
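As a quick illustration of the definitions above, the snippet below computes all four metrics directly from the four confusion-matrix counts. The counts are made-up numbers purely for illustration, not taken from the dataset used later.

# Made-up confusion-matrix counts, purely for illustration
tp, fp, fn, tn = 30, 5, 10, 55

accuracy  = (tp + tn) / (tp + tn + fp + fn)                 # share of all predictions that are correct
precision = tp / (tp + fp)                                  # share of predicted positives that are truly positive
recall    = tp / (tp + fn)                                  # share of actual positives that were found
f1        = 2 * precision * recall / (precision + recall)   # harmonic mean of precision and recall

print(accuracy, precision, recall, f1)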
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('../datasets/Social_Network_Ads.csv')
data
|     | User ID  | Gender | Age | EstimatedSalary | Purchased |
| --- | -------- | ------ | --- | --------------- | --------- |
| 0   | 15624510 | Male   | 19  | 19000           | 0 |
| 1   | 15810944 | Male   | 35  | 20000           | 0 |
| 2   | 15668575 | Female | 26  | 43000           | 0 |
| 3   | 15603246 | Female | 27  | 57000           | 0 |
| 4   | 15804002 | Male   | 19  | 76000           | 0 |
| …   | …        | …      | …   | …               | … |
| 395 | 15691863 | Female | 46  | 41000           | 1 |
| 396 | 15706071 | Male   | 51  | 23000           | 1 |
| 397 | 15654296 | Female | 50  | 20000           | 1 |
| 398 | 15755018 | Male   | 36  | 33000           | 0 |
| 399 | 15594041 | Female | 49  | 36000           | 1 |
400 rows × 5 columns
# Features: Age and EstimatedSalary; target: Purchased
X = data.iloc[:, [2, 3]].values
y = data.iloc[:, 4].values
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
# Feature Scaling: fit the scaler on the training set only, then apply the same transform to the test set
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Fitting the K-NN classifier to the Training set
from sklearn.neighbors import KNeighborsClassifier
# Minkowski distance with p = 2 is the standard Euclidean distance
classifier = KNeighborsClassifier(n_neighbors = 5, metric = 'minkowski', p = 2)
classifier.fit(X_train, y_train)
KNeighborsClassifier()
y_pred = classifier.predict(X_test)
y_pred
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1], dtype=int64)
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cm
array([[64, 4], [ 3, 29]], dtype=int64)
# Visualising the Test set results
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
Z = classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape)
plt.contourf(X1, X2, Z)
plt.scatter(X_set[y_set == 0, 0], X_set[y_set == 0, 1],label = 0)
plt.scatter(X_set[y_set == 1, 0], X_set[y_set == 1, 1],label = 1)
plt.title('K-Nearest Neighbors (K-NN) (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
The confusion matrix returned by scikit-learn places the actual classes on the rows and the predicted classes on the columns, so for binary labels it is laid out as:

|          | Predicted 0 | Predicted 1 |
| -------- | ----------- | ----------- |
| Actual 0 | TN          | FP          |
| Actual 1 | FN          | TP          |
cm
array([[64, 4], [ 3, 29]], dtype=int64)
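Given that row-major layout, the four counts can also be unpacked from the matrix with `ravel()`, which is often more readable than indexing `cm` directly:

# Unpack the counts in scikit-learn's row-major order: TN, FP, FN, TP
tn, fp, fn, tp = cm.ravel()  # -> 64, 4, 3, 29 for the matrix above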
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
accuracy
0.93
# Calculate precision
precision = precision_score(y_test, y_pred)
precision
0.8787878787878788
# Calculate recall
recall = recall_score(y_test, y_pred)
recall
0.90625
# Calculate F1-score
f1 = f1_score(y_test, y_pred)
f1
0.8923076923076922
# Sanity check against the confusion matrix: (TN + TP) / total = (64 + 29) / 100
93/100
0.93
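The micro and macro averaging mentioned earlier plug into the same scikit-learn functions via the `average` argument. Below is a small self-contained sketch using toy three-class labels that are unrelated to the dataset above:

from sklearn.metrics import precision_score, recall_score, f1_score

# Toy three-class labels, purely illustrative
y_true_mc = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred_mc = [0, 2, 2, 2, 1, 0, 1, 1]

# macro: unweighted mean of the per-class scores; micro: computed from the pooled counts
print(precision_score(y_true_mc, y_pred_mc, average='macro'))
print(recall_score(y_true_mc, y_pred_mc, average='micro'))
print(f1_score(y_true_mc, y_pred_mc, average='macro'))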