Grid Search

What is Grid Search?

Grid Search is a hyperparameter tuning technique used in machine learning to optimize model performance by systematically searching through a specified set of hyperparameter values. Hyperparameters are settings that govern the training process of a model, and finding the right combination can significantly impact the model’s accuracy and effectiveness.

Key Concepts

Hyperparameters:
- These are parameters that are set before the training process begins. Unlike model parameters (like weights in a neural network), hyperparameters control aspects of the training process itself.
- Examples include the learning rate, number of trees in ensemble methods, maximum depth of trees, etc.
Parameter Grid:
- A grid is defined by listing each hyperparameter together with the discrete values you want to try. For example, you might list several candidate values for the learning rate and for the maximum depth.
- Each combination of hyperparameters forms a unique configuration that the model will be trained on.
Cross-Validation:
- Grid Search often incorporates cross-validation, a technique that splits the training dataset into several subsets (folds). The model is trained on all folds but one and validated on the held-out fold, which gives a more reliable performance estimate than a single train/validation split.
- The most common method used is k-fold cross-validation, where the dataset is divided into k subsets, and the model is trained and validated k times, each time using a different subset for validation.
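The k-fold splitting described above can be illustrated with a minimal sketch. It assumes scikit-learn's `KFold` and ten toy samples (`X_demo` is made up for illustration); each sample lands in the validation fold exactly once.

```python
import numpy as np
from sklearn.model_selection import KFold

# Ten toy samples, used only to make the index splits visible.
X_demo = np.arange(10).reshape(-1, 1)

# k = 5 folds, no shuffling, so the folds are contiguous index blocks.
kf = KFold(n_splits=5)

folds = []
for train_idx, val_idx in kf.split(X_demo):
    folds.append((train_idx, val_idx))
    print("train:", train_idx, "validate:", val_idx)
```

Each iteration trains on 8 samples and validates on the remaining 2, so the model would be fitted five times in total.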

How Grid Search Works

Define the Model: Choose the machine learning model you want to optimize (e.g., SVM, XGBoost, Random Forest).

Set Hyperparameter Grid: Specify the hyperparameters and their corresponding values to test. For instance:

param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.1, 1]
}

Model Training: For each combination of hyperparameters, the model is trained and evaluated using cross-validation.
Evaluate Performance: The performance metrics (like accuracy, precision, recall) are recorded for each combination.
Select Best Hyperparameters: After evaluating all combinations, the hyperparameters that resulted in the best performance are selected.
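The steps above can be sketched as a plain loop. This is an illustration rather than how `GridSearchCV` is implemented: it assumes synthetic data from `make_classification`, a `GradientBoostingClassifier` as the model, and a deliberately smaller grid (`small_grid`, a made-up name) so that it runs quickly.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data, used only so the sketch runs on its own.
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

# A smaller grid than the one shown above, chosen to keep the run short.
small_grid = {
    'n_estimators': [10, 20],
    'max_depth': [2, 3],
    'learning_rate': [0.1, 1.0],
}

names = list(small_grid)
results = []
# Enumerate every combination of the listed values.
for values in product(*(small_grid[name] for name in names)):
    params = dict(zip(names, values))
    model = GradientBoostingClassifier(random_state=0, **params)
    # Train and evaluate with 3-fold cross-validation, then record the mean score.
    scores = cross_val_score(model, X_demo, y_demo, cv=3, scoring='accuracy')
    results.append((scores.mean(), params))

# Select the combination with the highest mean cross-validated accuracy.
best_score, best_params = max(results, key=lambda r: r[0])
print(len(results), "combinations tried")
print("best:", best_params, round(best_score, 3))
```

`GridSearchCV` wraps the same idea and adds conveniences such as refitting the best model and parallel execution.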

Benefits of Grid Search

Exhaustive Search: Every combination of the specified hyperparameter values is evaluated, so the best configuration within the grid is guaranteed to be found. Values that are not listed in the grid are never tried.
Simplicity: The method is straightforward to implement and understand.
Integrates with Cross-Validation: By incorporating cross-validation, Grid Search provides a robust evaluation of model performance.

Drawbacks of Grid Search

Computationally Intensive: The model is trained once per fold for every combination, so the total number of fits is the number of combinations times the number of folds. A large grid can therefore lead to long training times.
Curse of Dimensionality: As the number of hyperparameters increases, the grid grows exponentially, making it less feasible for high-dimensional hyperparameter spaces.
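The growth in cost can be worked out directly. The sketch below uses two hypothetical helpers, `grid_size` and `total_fits`, on the grid from the earlier section. The extra `subsample` entry is added purely to show the effect of one more hyperparameter.

```python
from math import prod

def grid_size(param_grid):
    """Number of combinations: the product of the value counts."""
    return prod(len(values) for values in param_grid.values())

def total_fits(param_grid, cv):
    """Each combination is trained once per cross-validation fold."""
    return grid_size(param_grid) * cv

param_grid_demo = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.1, 1],
}

base_size = grid_size(param_grid_demo)          # 3 * 3 * 3 = 27
base_fits = total_fits(param_grid_demo, cv=5)   # 27 * 5 = 135

# One more hyperparameter with four values multiplies the grid by four.
param_grid_demo['subsample'] = [0.5, 0.7, 0.9, 1.0]
bigger_size = grid_size(param_grid_demo)        # 27 * 4 = 108

print(base_size, base_fits, bigger_size)
```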

Summary

Grid Search is a powerful and widely used technique for hyperparameter tuning in machine learning. It systematically evaluates different combinations of hyperparameters to find the best configuration for a given model, helping improve its predictive performance. However, due to its exhaustive nature, it can be computationally expensive, especially with a large parameter grid. For more efficient hyperparameter optimization, alternative methods like Random Search or Bayesian Optimization may also be considered.
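For comparison, here is a minimal sketch of Random Search using scikit-learn's `RandomizedSearchCV`. The synthetic data, the sampling ranges, and `n_iter=10` are assumptions chosen for illustration, not recommendations.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic data, used only so the sketch runs on its own.
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)

# Distributions to sample from, instead of fixed lists of values.
param_distributions = {
    'C': loguniform(1e-1, 1e2),
    'gamma': loguniform(1e-3, 1e1),
}

random_search = RandomizedSearchCV(
    estimator=SVC(kernel='rbf'),
    param_distributions=param_distributions,
    n_iter=10,            # only 10 sampled configurations are evaluated
    scoring='accuracy',
    cv=5,
    random_state=0,
)
random_search.fit(X_demo, y_demo)
print(random_search.best_params_, round(random_search.best_score_, 3))
```

The cost is fixed by `n_iter` rather than by the size of a grid, which is why this approach scales better as more hyperparameters are added.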

Let’s Review It in Practice

Grid Search using K-Fold

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [3]:

data = pd.read_csv('Social_Network_Ads.csv')
data.head()

Out[3]:

	User ID	Gender	Age	EstimatedSalary
0	15624510	Male	19.0	19000.0
1	15810944	Male	35.0	20000.0
2	15668575	Female	26.0	43000.0
3	15603246	Female	27.0	57000.0
4	15804002	Male	19.0	76000.0

In [4]:

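# Features: Age and EstimatedSalary (columns 2 and 3)
X = data.iloc[:, 2:4].values
# Target: the fifth column (index 4), which the preview above does not show
y = data.iloc[:, 4].values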

In [5]:

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.25,random_state=0)

In [6]:

# Age against EstimatedSalary, before scaling
plt.scatter(X_train[:, 0], X_train[:, 1])
plt.show()


In [7]:

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# Fit the scaler on the training data only, then reuse its statistics for the test data
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

In [8]:

# Same plot after standardization: both features now share a comparable scale
plt.scatter(X_train[:, 0], X_train[:, 1])
plt.show()

In [9]:

from sklearn.svm import SVC
classifier = SVC(kernel='rbf',random_state=0)
classifier.fit(X_train,y_train)

Out[9]:

SVC(random_state=0)


In [10]:

from sklearn.model_selection import cross_val_score
# 10-fold cross-validated accuracy of the baseline classifier on the training set
accuracies = cross_val_score(estimator=classifier, X=X_train, y=y_train, cv=10)

In [11]:

accuracies

Out[11]:

array([0.8       , 0.96666667, 0.8       , 0.96666667, 0.86666667,
       0.86666667, 0.9       , 0.93333333, 1.        , 0.93333333])

In [12]:

accuracies.mean()

Out[12]:

0.9033333333333333

In [13]:

accuracies.std()

Out[13]:

0.06574360974438671

In [14]:

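from sklearn.model_selection import GridSearchCV
# A list of dicts defines several sub-grids; each is expanded separately.
# Note: SVC uses 'degree' only with the 'poly' kernel and ignores it for 'rbf',
# so the five degree values below repeat the same fits without changing the scores.
parameters = [{'C': [1, 10, 100, 1000], 'kernel': ['linear']},
              {'C': [1, 10, 100, 1000], 'kernel': ['sigmoid']},
              {'C': [0.7, 0.8, 0.9, 1.0, 1.1, 1.2], 'kernel': ['rbf'],
               'gamma': [1, 2, 3, 4, 5, 6, 7], 'degree': [1, 2, 3, 4, 5]}]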

In [15]:

grid_search = GridSearchCV(estimator=classifier,param_grid = parameters,scoring='accuracy',cv=10)
grid_search = grid_search.fit(X_train,y_train)
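To get a sense of how much work this search does, the candidates can be counted before fitting. The sketch below assumes scikit-learn's `ParameterGrid` and repeats the `parameters` list from above. The `trimmed` variant is a hypothetical alternative that drops `degree`, which `SVC` ignores for the `rbf` kernel.

```python
from sklearn.model_selection import ParameterGrid

parameters = [{'C': [1, 10, 100, 1000], 'kernel': ['linear']},
              {'C': [1, 10, 100, 1000], 'kernel': ['sigmoid']},
              {'C': [0.7, 0.8, 0.9, 1.0, 1.1, 1.2], 'kernel': ['rbf'],
               'gamma': [1, 2, 3, 4, 5, 6, 7], 'degree': [1, 2, 3, 4, 5]}]

# Sub-grids are counted separately and summed: 4 + 4 + 6 * 7 * 5
n_candidates = len(ParameterGrid(parameters))
# With cv=10, every candidate is fitted ten times.
n_fits = n_candidates * 10
print(n_candidates, n_fits)

# Same grid without 'degree' in the rbf sub-grid: 4 + 4 + 6 * 7
trimmed = [dict(grid) for grid in parameters]
trimmed[2].pop('degree')
n_trimmed = len(ParameterGrid(trimmed))
print(n_trimmed, n_trimmed * 10)
```

Dropping the unused hyperparameter cuts the search from 2180 fits to 500 while covering the same distinct models.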

In [16]:

grid_search.best_score_

Out[16]:

0.9133333333333334

In [17]:

grid_search.best_params_

Out[17]:

{'C': 0.7, 'degree': 1, 'gamma': 3, 'kernel': 'rbf'}

In [18]:

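# Retrain on the full training set with the values reported by best_params_
# (degree has no effect with the rbf kernel; it is kept only to mirror best_params_)
classifier = SVC(kernel='rbf', random_state=0, C=0.7, gamma=3, degree=1)
classifier.fit(X_train, y_train)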

Out[18]:

SVC(C=0.7, degree=1, gamma=3, random_state=0)
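Retyping the best values by hand works, but it is easy to get wrong. As a sketch of an alternative: `GridSearchCV` refits the best configuration on the whole training set by default (`refit=True`) and exposes it as `best_estimator_`. The example assumes synthetic data and a small made-up grid so that it runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic data, used only so the sketch runs on its own.
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0)

search = GridSearchCV(
    estimator=SVC(kernel='rbf'),
    param_grid={'C': [0.1, 1, 10], 'gamma': [0.1, 1]},
    scoring='accuracy',
    cv=5,
)
search.fit(X_tr, y_tr)

# Already trained on all of X_tr with the best parameters found.
best_model = search.best_estimator_
test_accuracy = best_model.score(X_te, y_te)
print(search.best_params_, test_accuracy)
```

Calling `search.predict` or `search.score` delegates to the same refitted model.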

In [19]:

classifier.score(X_test,y_test) * 100

Out[19]:

93.0

In [20]:

y_pred = classifier.predict(X_test)
y_pred

Out[20]:

array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1,
       0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
       1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1,
       0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1,
       1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1], dtype=int64)

In [21]:

y_test

Out[21]:

array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1,
       0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
       1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1,
       0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1,
       1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1], dtype=int64)

In [22]:

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cm

Out[22]:

array([[64,  4],
       [ 3, 29]], dtype=int64)
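The confusion matrix above can be turned into summary metrics by hand. This sketch copies the printed values and relies on scikit-learn's convention that rows are true labels and columns are predicted labels.

```python
import numpy as np

# Values copied from the confusion matrix output above.
cm = np.array([[64, 4],
               [3, 29]])

# Row = true class, column = predicted class.
tn, fp = cm[0]
fn, tp = cm[1]

accuracy = (tp + tn) / cm.sum()   # (29 + 64) / 100
precision = tp / (tp + fp)        # 29 / 33
recall = tp / (tp + fn)           # 29 / 32

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

The accuracy of 0.93 agrees with the 93.0 reported by `classifier.score` earlier.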
