Monday, May 20, 2024

Naive Bayes Classification

Naive Bayes is a simple yet effective supervised machine learning algorithm commonly used for classification tasks. It applies Bayes’ theorem under the “naive” assumption that the features are conditionally independent given the class. In this example, I’ll provide a step-by-step guide for implementing Naive Bayes classification in Python using Scikit-Learn. There are different variants of Naive Bayes, such as Gaussian Naive Bayes for continuous data and Multinomial Naive Bayes for count data such as word frequencies in text. I’ll demonstrate Gaussian Naive Bayes for simplicity:

Step 1: Import Libraries

import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import classification_report, confusion_matrix

Step 2: Prepare Your Data
Ensure your dataset contains a feature matrix (X) and the corresponding target labels (y), stored as NumPy arrays or a pandas DataFrame. Gaussian Naive Bayes expects numeric features, so encode any categorical columns before training.
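
The steps that follow assume X and y already exist. As a minimal sketch, assuming you have no dataset at hand, the snippet below generates a synthetic two-class, two-feature dataset with Scikit-Learn’s make_classification. The sample count and random seed are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification

# Synthetic stand-in for a real dataset (an assumption for illustration):
# 500 samples, 2 informative features, 2 classes.
X, y = make_classification(
    n_samples=500,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_classes=2,
    random_state=0,
)

print(X.shape)       # (500, 2)
print(np.unique(y))  # [0 1]
```

With real data, X would typically hold the numeric feature columns and y the label column.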

Step 3: Split Data into Training and Testing Sets
Split your data into training and testing sets to evaluate the model’s performance.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
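
If the classes are imbalanced, a stratified split keeps the class proportions similar in both subsets. The following is a small sketch on synthetic data. The dataset is a stand-in, and stratify=y is an optional train_test_split argument that the call above does not use.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in dataset (roughly 90% class 0, 10% class 1).
X, y = make_classification(
    n_samples=500, n_features=2, n_informative=2, n_redundant=0,
    weights=[0.9, 0.1], random_state=0,
)

# stratify=y preserves the class ratio in the training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

print(X_train.shape, X_test.shape)    # (400, 2) (100, 2)
print(y_train.mean(), y_test.mean())  # share of class 1 in each subset
```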

Step 4: Create the Naive Bayes Classifier (Gaussian Naive Bayes)

classifier = GaussianNB()

Step 5: Train the Naive Bayes Classifier

classifier.fit(X_train, y_train)

Step 6: Make Predictions

y_pred = classifier.predict(X_test)
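
To make explicit what predict computes, the sketch below reproduces the Gaussian Naive Bayes decision rule by hand: for each class it adds the log prior to the sum of per-feature Gaussian log-densities, then picks the class with the largest total. The dataset is synthetic, manual_predict is a hypothetical helper written for this illustration, and the smoothing term mirrors the default var_smoothing=1e-9 of GaussianNB.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in dataset (an assumption for illustration).
X, y = make_classification(n_samples=300, n_features=4, random_state=0)
classifier = GaussianNB().fit(X, y)


def manual_predict(X_train, y_train, X_new, var_smoothing=1e-9):
    """Hypothetical helper: Gaussian Naive Bayes prediction from scratch."""
    classes = np.unique(y_train)
    # Small constant added to every variance for numerical stability.
    eps = var_smoothing * X_train.var(axis=0).max()
    scores = []
    for c in classes:
        Xc = X_train[y_train == c]
        mean = Xc.mean(axis=0)
        var = Xc.var(axis=0) + eps
        log_prior = np.log(Xc.shape[0] / X_train.shape[0])
        # Sum of independent per-feature Gaussian log-densities.
        log_likelihood = -0.5 * np.sum(
            np.log(2.0 * np.pi * var) + (X_new - mean) ** 2 / var, axis=1
        )
        scores.append(log_prior + log_likelihood)
    # Choose the class with the highest log posterior (up to a constant).
    return classes[np.argmax(np.vstack(scores), axis=0)]


manual = manual_predict(X, y, X)
agreement = np.mean(manual == classifier.predict(X))
print(f"Agreement with GaussianNB.predict: {agreement:.3f}")
```

Summing log-densities across features is where the conditional independence assumption enters: each feature contributes to the score separately.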

Step 7: Evaluate the Model
Evaluate the model’s performance using classification metrics such as accuracy, precision, recall, F1-score, and the confusion matrix.

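from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

accuracy = accuracy_score(y_test, y_pred)
# The defaults below assume binary labels with 1 as the positive class.
# For multiclass targets, pass average='macro' (or 'weighted') to each call.
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)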

print(f'Accuracy: {accuracy}')
print(f'Precision: {precision}')
print(f'Recall: {recall}')
print(f'F1-Score: {f1}')

confusion = confusion_matrix(y_test, y_pred)
print('Confusion Matrix:')
print(confusion)
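
For a per-class summary in a single call, classification_report (imported in Step 1) prints precision, recall, and F1-score for every class. The sketch below uses the three-class Iris dataset bundled with Scikit-Learn as a stand-in, and shows macro averaging as one option for multiclass targets.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Iris is used here only as a convenient multiclass stand-in dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
y_pred = GaussianNB().fit(X_train, y_train).predict(X_test)

# Per-class precision, recall, F1-score, and support.
report = classification_report(y_test, y_pred)
print(report)

# Multiclass targets need an explicit averaging strategy.
macro_f1 = f1_score(y_test, y_pred, average="macro")
print(f"Macro F1: {macro_f1:.3f}")
```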

Step 8: Visualize Results (Optional)
For a dataset with two features, you can plot the test points colored by their true class to see how well the classes separate. The example assumes X_test and y_test are NumPy arrays and that the labels are 0 and 1.

# Example visualization for a two-feature dataset
plt.scatter(X_test[y_test == 0][:, 0], X_test[y_test == 0][:, 1], color='red', label='Class 0')
plt.scatter(X_test[y_test == 1][:, 0], X_test[y_test == 1][:, 1], color='blue', label='Class 1')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Gaussian Naive Bayes Classifier')
plt.legend()
plt.show()
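
The scatter plot above shows only the data points. To draw the decision boundary itself, one common approach is to predict the class for every point on a dense grid and shade the resulting regions. The sketch below does this on a synthetic two-feature dataset. The grid resolution, margins, and colormap are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

# Synthetic two-feature stand-in dataset (an assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=2, n_informative=2, n_redundant=0,
    random_state=0,
)
classifier = GaussianNB().fit(X, y)

# Build a grid that covers the data with a small margin.
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(
    np.linspace(x_min, x_max, 200),
    np.linspace(y_min, y_max, 200),
)

# Predict a class for every grid point, then reshape back to the grid.
grid = np.c_[xx.ravel(), yy.ravel()]
Z = classifier.predict(grid).reshape(xx.shape)

# Shaded regions show the predicted class; points show the true labels.
plt.contourf(xx, yy, Z, alpha=0.3, cmap="coolwarm")
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="coolwarm", edgecolor="k", s=20)
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.title("Gaussian Naive Bayes Decision Regions")
plt.show()
```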

Naive Bayes is particularly useful for text classification tasks, such as spam detection and sentiment analysis, but it can also be applied to other types of data with suitable preprocessing.
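
As a rough sketch of the text use case, the snippet below pairs CountVectorizer with Multinomial Naive Bayes in a pipeline. The tiny corpus and its labels are invented for illustration only. A real spam filter would need far more data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus made up for this example (not real data).
texts = [
    "win a free prize now",
    "free money claim your prize",
    "limited offer win cash now",
    "meeting scheduled for monday",
    "please review the project report",
    "lunch with the team tomorrow",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# CountVectorizer turns text into word counts; MultinomialNB models those counts.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

predictions = model.predict(["free prize waiting for you", "see you at the meeting"])
print(predictions)
```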
