The F1 Score evaluates a classification model by combining precision and recall into a single number. It balances the two, which matters especially when the class distribution is uneven or when both false positives and false negatives carry a cost.
The F1 score is the harmonic mean of precision and recall:
$F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
Example:
If a model has a precision of 0.75 (75%) and a recall of 0.60 (60%), the F1 score would be calculated as:
$F1 = 2 \times \frac{0.75 \times 0.60}{0.75 + 0.60} = 2 \times \frac{0.45}{1.35} \approx 0.67$
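The same arithmetic in Python, as a minimal sketch using just the two numbers above:
precision, recall = 0.75, 0.60
print(2 * (precision * recall) / (precision + recall))  # ≈ 0.67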
Usefulness:
- The F1 Score is especially useful when there is an imbalance between positive and negative classes or when both precision and recall are important.
- It helps balance the trade-off between identifying relevant instances (recall) and ensuring the predictions are correct (precision).
A high F1 score means the model performs well in both precision and recall.
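Because the harmonic mean is pulled toward the smaller of its inputs, F1 drops quickly when precision and recall diverge. The sketch below uses made-up precision/recall pairs purely for illustration:
# Illustrative (made-up) precision/recall pairs
for p, r in [(0.9, 0.9), (0.9, 0.5), (0.9, 0.1)]:
    print(p, r, '->', round(2 * p * r / (p + r), 2))
# 0.9 0.9 -> 0.9
# 0.9 0.5 -> 0.64
# 0.9 0.1 -> 0.18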
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Load the dataset
data = pd.read_csv('Social_Network_Ads.csv')
data
|     | User ID  | Gender | Age  | EstimatedSalary | Purchased |
| --- | -------- | ------ | ---- | --------------- | --------- |
| 0   | 15624510 | Male   | 19.0 | 19000.0         | 0 |
| 1   | 15810944 | Male   | 35.0 | 20000.0         | 0 |
| 2   | 15668575 | Female | 26.0 | 43000.0         | 0 |
| 3   | 15603246 | Female | 27.0 | 57000.0         | 0 |
| 4   | 15804002 | Male   | 19.0 | 76000.0         | 0 |
| …   | …        | …      | …    | …               | … |
| 395 | 15691863 | Female | 46.0 | 41000.0         | 1 |
| 396 | 15706071 | Male   | 51.0 | 23000.0         | 1 |
| 397 | 15654296 | Female | 50.0 | 20000.0         | 1 |
| 398 | 15755018 | Male   | 36.0 | 33000.0         | 0 |
| 399 | 15594041 | Female | 49.0 | 36000.0         | 1 |

400 rows × 5 columns
# Features: Age (column 2) and EstimatedSalary (column 3); target: Purchased (column 4)
X = data.iloc[:, [2, 3]].values
y = data.iloc[:, 4].values
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
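With test_size = 0.25 on 400 rows, the test set holds exactly 100 samples, which makes the confusion-matrix counts below read directly as percentages. A quick optional check:
print(X_train.shape, X_test.shape)  # (300, 2) (100, 2)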
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
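Note that the scaler is fit on the training set only and then applied unchanged to the test set, so no test-set statistics leak into preprocessing. As an optional sanity check, the scaled training features should have roughly zero mean and unit variance:
print(X_train.mean(axis = 0).round(2))  # ~[0. 0.]
print(X_train.std(axis = 0).round(2))   # ~[1. 1.]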
from sklearn.neighbors import KNeighborsClassifier
# Minkowski distance with p = 2 is the Euclidean distance
classifier = KNeighborsClassifier(n_neighbors = 5, metric = 'minkowski', p = 2)
classifier.fit(X_train, y_train)
KNeighborsClassifier()
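As a quick sanity check, a single new observation can be classified by passing it through the same scaler first; the Age/EstimatedSalary values here are made up for illustration:
sample = sc.transform([[30, 87000]])  # made-up Age and EstimatedSalary
print(classifier.predict(sample))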
y_pred = classifier.predict(X_test)
y_pred
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1], dtype=int64)
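A quick count of the predicted positives; the 33 here lines up with the predicted-positive column of the confusion matrix computed next (FP + TP = 4 + 29):
print(int(y_pred.sum()))  # 33 predicted purchasers out of 100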
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cm
array([[64,  4],
       [ 3, 29]], dtype=int64)
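From these four entries, precision and recall can be recovered by hand before turning to f1_score (numpy's ravel() flattens the 2×2 matrix in row order: TN, FP, FN, TP):
tn, fp, fn, tp = cm.ravel()
precision = tp / (tp + fp)  # 29 / 33 ≈ 0.879
recall = tp / (tp + fn)     # 29 / 32 ≈ 0.906
print(round(precision, 3), round(recall, 3))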
# Visualising the Test set results
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
Z = classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape)
plt.contourf(X1, X2, Z)
plt.scatter(X_set[y_set == 0, 0], X_set[y_set == 0, 1], label = 0)
plt.scatter(X_set[y_set == 1, 0], X_set[y_set == 1, 1], label = 1)
plt.title('K-Nearest Neighbors (K-NN) (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
from sklearn.metrics import f1_score
The confusion matrix returned by scikit-learn places actual classes on the rows and predicted classes on the columns:
- TN | FP
- FN | TP
cm
array([[64,  4],
       [ 3, 29]], dtype=int64)
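For readability, the same matrix can be wrapped in a labelled DataFrame (a small sketch reusing the pandas import from above):
pd.DataFrame(cm, index = ['actual 0', 'actual 1'], columns = ['predicted 0', 'predicted 1'])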
# Calculate F1-score
f1 = f1_score(y_test, y_pred)
f1
0.8923076923076924
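This matches the value implied by the confusion matrix, since F1 can equivalently be written as 2·TP / (2·TP + FP + FN):
print(2 * 29 / (2 * 29 + 4 + 3))  # ≈ 0.8923, matching f1_score above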
# Accuracy for comparison: (TN + TP) / total = (64 + 29) / 100
(64 + 29) / 100
0.93