Precision Score is a metric used in classification tasks to measure how many of the positive predictions made by a model are actually correct. In simpler terms, it answers the question: Out of all the instances the model predicted as positive, how many were truly positive?
The formula for precision is:
$\text{Precision} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}}$
Key Terms:
- True Positives (TP): The number of correct positive predictions.
- False Positives (FP): The number of incorrect positive predictions (predicted positive, but actually negative).
Example:
If a model predicts 10 positive cases, and 8 of those are correct while 2 are incorrect, the precision score would be:
$\text{Precision} = \frac{8}{8 + 2} = \frac{8}{10} = 0.8 \text{ or } 80\%$
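The same arithmetic takes only a couple of lines of plain Python; the snippet below is a minimal sanity check using the counts from the example above.

# Worked example: 8 correct positive predictions, 2 incorrect ones
tp = 8
fp = 2
precision = tp / (tp + fp)
print(precision)  # 0.8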
Usefulness:
Precision is particularly useful when the cost of false positives is high. For example, in spam detection, you want high precision to ensure that legitimate emails are not incorrectly marked as spam. It’s a good metric when you care more about the quality of positive predictions than the quantity.
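scikit-learn exposes this metric directly as `precision_score`. The snippet below is a small illustration with made-up spam labels (1 = spam, 0 = not spam), chosen only to show the function; the rest of this section applies it to a real dataset.

from sklearn.metrics import precision_score
# Hypothetical ground truth and predictions, purely for illustration (1 = spam, 0 = not spam)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
# 4 emails are predicted as spam and 3 of them really are spam, so precision = 0.75
print(precision_score(y_true, y_pred))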
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Load the dataset
data = pd.read_csv('Social_Network_Ads.csv')
data
|     | User ID  | Gender | Age  | EstimatedSalary | Purchased |
|-----|----------|--------|------|-----------------|-----------|
| 0   | 15624510 | Male   | 19.0 | 19000.0         | 0         |
| 1   | 15810944 | Male   | 35.0 | 20000.0         | 0         |
| 2   | 15668575 | Female | 26.0 | 43000.0         | 0         |
| 3   | 15603246 | Female | 27.0 | 57000.0         | 0         |
| 4   | 15804002 | Male   | 19.0 | 76000.0         | 0         |
| …   | …        | …      | …    | …               | …         |
| 395 | 15691863 | Female | 46.0 | 41000.0         | 1         |
| 396 | 15706071 | Male   | 51.0 | 23000.0         | 1         |
| 397 | 15654296 | Female | 50.0 | 20000.0         | 1         |
| 398 | 15755018 | Male   | 36.0 | 33000.0         | 0         |
| 399 | 15594041 | Female | 49.0 | 36000.0         | 1         |

400 rows × 5 columns
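Since precision focuses on the positive class (`Purchased = 1`), it can be worth checking how many purchasers the data actually contains before modelling; a quick look, assuming the `data` frame loaded above:

# How many non-purchasers (0) vs purchasers (1) are in the data?
data['Purchased'].value_counts()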
# Features: Age and EstimatedSalary (columns 2 and 3); target: Purchased (column 4)
X = data.iloc[:, [2, 3]].values
y = data.iloc[:, 4].values
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
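With 400 rows and `test_size = 0.25`, this leaves 300 observations for training and 100 for testing, which can be confirmed with a quick shape check:

# Expect (300, 2) / (100, 2) for the features and (300,) / (100,) for the labels
print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)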
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)  # fit the scaler on the training set only
X_test = sc.transform(X_test)  # apply the same training-set statistics to the test set
# K-Nearest Neighbors classifier: 5 neighbours, Euclidean distance (Minkowski with p = 2)
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors = 5, metric = 'minkowski', p = 2)
classifier.fit(X_train, y_train)
KNeighborsClassifier()
y_pred = classifier.predict(X_test)
y_pred
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1], dtype=int64)
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cm
array([[64,  4],
       [ 3, 29]], dtype=int64)
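If your scikit-learn version is 0.22 or newer, the matrix can also be rendered as an annotated heatmap with `ConfusionMatrixDisplay`; a minimal sketch using the `cm` computed above:

from sklearn.metrics import ConfusionMatrixDisplay
# Plot the 2x2 confusion matrix with the counts annotated in each cell
ConfusionMatrixDisplay(confusion_matrix=cm).plot()
plt.show()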
# Visualising the Test set results
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
Z = classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape)
plt.contourf(X1, X2, Z)
plt.scatter(X_set[y_set == 0, 0], X_set[y_set == 0, 1], label = 0)
plt.scatter(X_set[y_set == 1, 0], X_set[y_set == 1, 1], label = 1)
plt.title('K-Nearest Neighbors (K-NN) (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
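Note that the plot above is drawn in the standardised feature space, so the axis values are scaled units rather than raw ages and salaries. If you want the scatter in the original units, one option is to inverse-transform the scaled features first; the sketch below reuses the `sc`, `X_set` and `y_set` objects from the cells above.

# Recover the original (unscaled) ages and salaries for plotting
X_orig = sc.inverse_transform(X_set)
plt.scatter(X_orig[y_set == 0, 0], X_orig[y_set == 0, 1], label = 0)
plt.scatter(X_orig[y_set == 1, 0], X_orig[y_set == 1, 1], label = 1)
plt.title('Test set in original units')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()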
from sklearn.metrics import precision_score
The confusion matrix returned by scikit-learn has the true classes on the rows and the predicted classes on the columns:

|          | Predicted 0 | Predicted 1 |
|----------|-------------|-------------|
| Actual 0 | TN          | FP          |
| Actual 1 | FN          | TP          |
cm
array([[64,  4],
       [ 3, 29]], dtype=int64)
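To tie this back to the precision formula, the four counts can be unpacked and precision computed by hand; the result should agree with `precision_score` below.

# Unpack the confusion matrix: [[TN, FP], [FN, TP]]
tn, fp, fn, tp = cm.ravel()
manual_precision = tp / (tp + fp)  # 29 / (29 + 4)
manual_precision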
# Calculate precision
precision = precision_score(y_test, y_pred)
precision
0.8787878787878788
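This agrees with the manual calculation from the confusion matrix, TP / (TP + FP) = 29 / (29 + 4) ≈ 0.879: roughly 88% of the users the model flagged as purchasers actually purchased.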