Grid Search
Grid search is a technique for systematically finding the best combination of hyperparameters for a given model. Hyperparameters are settings that are not learned from the data; the user sets them before training.
In grid search, you define a list of candidate values for each hyperparameter you want to tune. The grid is the set of all combinations of those values. The algorithm evaluates the model with cross-validation for each combination and selects the one with the best score on the chosen performance metric.
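To make the procedure concrete, here is a minimal hand-written sketch of the loop that grid search performs. The Iris data, the two-parameter `demo_grid`, and the variable names are assumptions chosen for illustration; this is not how scikit-learn implements it internally.

```python
from itertools import product

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Illustrative data and grid; both are assumptions for this sketch
X_demo, y_demo = load_iris(return_X_y=True)
demo_grid = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]}

best_score, best_params = -1.0, None
names = list(demo_grid)
# product() yields every combination of the candidate values
for values in product(*(demo_grid[name] for name in names)):
    params = dict(zip(names, values))
    # Mean accuracy over 5 cross-validation folds for this combination
    score = cross_val_score(SVC(**params), X_demo, y_demo, cv=5).mean()
    if score > best_score:
        best_score, best_params = score, params

print(best_params, round(best_score, 3))
```

`GridSearchCV`, used in the steps that follow, wraps this loop and also records the score of every combination.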
Here’s a breakdown of the code with headings for each step:
1. Importing Required Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
2. Loading the Dataset
data = pd.read_csv('Social_Network_Ads.csv')
data
| | User ID | Gender | Age | EstimatedSalary | Purchased |
|---|---|---|---|---|---|
| 0 | 15624510 | Male | 19.0 | 19000.0 | 0 |
| 1 | 15810944 | Male | 35.0 | 20000.0 | 0 |
| 2 | 15668575 | Female | 26.0 | 43000.0 | 0 |
| 3 | 15603246 | Female | 27.0 | 57000.0 | 0 |
| 4 | 15804002 | Male | 19.0 | 76000.0 | 0 |
| … | … | … | … | … | … |
| 395 | 15691863 | Female | 46.0 | 41000.0 | 1 |
| 396 | 15706071 | Male | 51.0 | 23000.0 | 1 |
| 397 | 15654296 | Female | 50.0 | 20000.0 | 1 |
| 398 | 15755018 | Male | 36.0 | 33000.0 | 0 |
| 399 | 15594041 | Female | 49.0 | 36000.0 | 1 |
400 rows × 5 columns
3. Defining Features and Target Variables
X = data.iloc[:, 2:4].values # Features: columns at positions 2 and 3 (Age, EstimatedSalary)
y = data.iloc[:, 4].values # Target: column at position 4 (Purchased)
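Positional slices such as `2:4` are easy to misread, so selecting by column name is a possible alternative. The sketch below uses a small stand-in frame (`demo`, built from the first three rows of the table above) rather than the CSV file; the variable names are hypothetical.

```python
import pandas as pd

# Tiny stand-in frame with the same column layout as the table above
demo = pd.DataFrame({
    'User ID': [15624510, 15810944, 15668575],
    'Gender': ['Male', 'Male', 'Female'],
    'Age': [19.0, 35.0, 26.0],
    'EstimatedSalary': [19000.0, 20000.0, 43000.0],
    'Purchased': [0, 0, 0],
})

X_by_position = demo.iloc[:, 2:4].values             # positions 2 and 3; the end of the slice is excluded
X_by_name = demo[['Age', 'EstimatedSalary']].values  # the same columns, selected by name
y_by_name = demo['Purchased'].values

# Both selections return the same array
print((X_by_position == X_by_name).all())
```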
4. Splitting the Data into Training and Test Sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
5. Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
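The scaler is fitted on the training rows only, and the test rows are transformed with those same training statistics. The following sketch illustrates the effect on synthetic numbers that loosely imitate the Age and EstimatedSalary columns; the data and variable names are assumptions, not the real dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for the two feature columns (an assumption)
train_demo = rng.normal(loc=[40, 70000], scale=[10, 30000], size=(300, 2))
test_demo = rng.normal(loc=[40, 70000], scale=[10, 30000], size=(100, 2))

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train_demo)  # learns mean and std from the training rows
test_scaled = scaler.transform(test_demo)        # reuses the training mean and std

# Training columns end up with mean 0 and standard deviation 1
print(train_scaled.mean(axis=0).round(6), train_scaled.std(axis=0).round(6))
# Test columns are only approximately standardized, which is expected
print(test_scaled.mean(axis=0).round(2), test_scaled.std(axis=0).round(2))
```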
6. Building and Training the SVM Model
from sklearn.svm import SVC
model = SVC(kernel='rbf', random_state=0, C=1, gamma=1)
model.fit(X_train, y_train)
SVC(C=1, gamma=1, random_state=0)
7. Evaluating the Model on the Test Set
model.score(X_test, y_test) * 100
93.0
8. Setting Up Hyperparameter Grid for Grid Search
grid_param = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1], 'kernel': ['linear', 'poly', 'sigmoid', 'rbf']}
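It can help to know how many models a grid implies before running the search. As a small sketch, scikit-learn's `ParameterGrid` enumerates the combinations; the names `n_candidates` and `n_folds` are introduced here for illustration, and the fold count assumes the default `cv` of scikit-learn 0.22 or later.

```python
from sklearn.model_selection import ParameterGrid

grid_param = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1], 'kernel': ['linear', 'poly', 'sigmoid', 'rbf']}

n_candidates = len(ParameterGrid(grid_param))  # 3 * 3 * 4 combinations
n_folds = 5  # default cv of GridSearchCV in scikit-learn 0.22 and later
print(n_candidates, n_candidates * n_folds)  # 36 180
```

Each extra value in any list multiplies the number of fits, which is why grids are usually kept coarse.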
9. Performing Grid Search for Hyperparameter Tuning
from sklearn.model_selection import GridSearchCV
gs = GridSearchCV(estimator=model, param_grid=grid_param)
gs.fit(X_train, y_train)
GridSearchCV(estimator=SVC(C=1, gamma=1, random_state=0), param_grid={'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1], 'kernel': ['linear', 'poly', 'sigmoid', 'rbf']})
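One caveat with this setup: the scaler in step 5 was fitted on the whole training set, so during cross-validation each validation fold has already influenced the scaling statistics. A common way to avoid this is to put the scaler and the model in a `Pipeline` and search over that. The sketch below assumes a synthetic two-feature dataset in place of the ads data, and the step names `scale` and `svc` are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-feature stand-in for the ads data (an assumption)
X_demo, y_demo = make_classification(n_samples=400, n_features=2, n_informative=2,
                                     n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

# The scaler is refitted inside every cross-validation split
pipe = Pipeline([('scale', StandardScaler()), ('svc', SVC(random_state=0))])
# The 'svc__' prefix routes each parameter to the SVC step
pipe_grid = {'svc__C': [0.1, 1, 10], 'svc__gamma': [0.01, 0.1, 1], 'svc__kernel': ['linear', 'rbf']}

pipe_search = GridSearchCV(estimator=pipe, param_grid=pipe_grid)
pipe_search.fit(X_tr, y_tr)
print(pipe_search.best_params_)
```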
10. Displaying the Best Hyperparameters and Score from Grid Search
gs.best_params_ # Best hyperparameters (not echoed here: a notebook cell displays only its last expression)
gs.best_score_ # Mean cross-validated accuracy of the best combination
0.9099999999999999
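The 0.91 above is a mean cross-validated accuracy computed on the training data, so it is not directly comparable with the 93.0 test-set score from step 7. A rough sketch of how the tuned model could be scored on held-out data, and how the per-combination results could be inspected, follows. It runs on a synthetic stand-in dataset and a reduced grid, both of which are assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic two-feature stand-in for the ads data (an assumption)
X_demo, y_demo = make_classification(n_samples=400, n_features=2, n_informative=2,
                                     n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

search = GridSearchCV(SVC(random_state=0), {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]})
search.fit(X_tr, y_tr)

# Mean cross-validated accuracy of the best combination (training data only)
print(search.best_score_)
# With the default refit=True, score() uses the best model refitted on all of X_tr
print(search.score(X_te, y_te))

# One row per combination, best first
results = pd.DataFrame(search.cv_results_)
print(results[['params', 'mean_test_score', 'rank_test_score']].sort_values('rank_test_score').head())
```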
Another Example: Iris Dataset
11. Loading the Iris Dataset
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data # Features
y = iris.target # Target classes
12. Building and Training the SVM Model on Iris Dataset
from sklearn.svm import SVC
model = SVC()
model.fit(X, y)
SVC()
13. Evaluating the Model on the Entire Iris Dataset
model.score(X, y) * 100
97.33333333333334
14. Setting Up Hyperparameter Grid for Grid Search (Iris Dataset)
grid_param = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1], 'kernel': ['linear', 'poly', 'sigmoid', 'rbf']}
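Because `gamma` has no effect on the linear kernel, the grid above fits several linear models that differ only in an ignored parameter. `param_grid` also accepts a list of dictionaries, which can be used to skip those redundant fits. The name `split_grid` below is hypothetical; the sketch only counts the resulting combinations.

```python
from sklearn.model_selection import ParameterGrid

# gamma is ignored by the linear kernel, so it is varied only for the other kernels
split_grid = [
    {'kernel': ['linear'], 'C': [0.1, 1, 10]},
    {'kernel': ['poly', 'sigmoid', 'rbf'], 'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]},
]

n_split = len(ParameterGrid(split_grid))
print(n_split)  # 3 + 27 = 30 combinations instead of 36
```

Such a list can be passed to `GridSearchCV` as `param_grid` in place of the single dictionary.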
15. Performing Grid Search for Hyperparameter Tuning (Iris Dataset)
from sklearn.model_selection import GridSearchCV
gs = GridSearchCV(estimator=model, param_grid=grid_param)
gs.fit(X, y)
GridSearchCV(estimator=SVC(), param_grid={'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1], 'kernel': ['linear', 'poly', 'sigmoid', 'rbf']})
16. Displaying the Best Hyperparameters from Grid Search (Iris Dataset)
gs.best_params_ # Displaying the best hyperparameters for the Iris dataset
{'C': 0.1, 'gamma': 0.1, 'kernel': 'poly'}
17. Training a New SVM Model with the Best Hyperparameters
from sklearn.svm import SVC
model = SVC(C=0.1, gamma=0.1, kernel='poly')
model.fit(X, y)
SVC(C=0.1, gamma=0.1, kernel='poly')
18. Evaluating the Final Model on the Iris Dataset
model.score(X, y) * 100
98.0
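The 98.0 here is measured on the same rows the model was trained on, so it is likely an optimistic estimate. A hedged variant of the Iris example that holds out a test split is sketched below. The split settings (`test_size=0.25`, `random_state=0`, `stratify`) and variable names are assumptions. It also relies on `best_estimator_`, which `GridSearchCV` refits on the training data by default, instead of retyping the parameters by hand.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)
# Hold out a quarter of the rows; the split settings are illustrative choices
X_tr, X_te, y_tr, y_te = train_test_split(
    X_iris, y_iris, test_size=0.25, random_state=0, stratify=y_iris)

iris_grid = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1], 'kernel': ['linear', 'poly', 'sigmoid', 'rbf']}
search = GridSearchCV(estimator=SVC(), param_grid=iris_grid)
search.fit(X_tr, y_tr)  # tuning sees only the training rows

# With the default refit=True, this model is already trained on X_tr with the best parameters
final_model = search.best_estimator_
print(search.best_params_)
print(final_model.score(X_te, y_te) * 100)  # accuracy on rows never used for tuning
```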