What is Grid Search?
Grid Search is a hyperparameter tuning technique used in machine learning to optimize model performance by systematically searching through a specified set of hyperparameter values. Hyperparameters are settings that govern the training process of a model, and finding the right combination can significantly impact the model’s accuracy and effectiveness.
Key Concepts
Hyperparameters:
- These are parameters that are set before the training process begins. Unlike model parameters (like weights in a neural network), hyperparameters control aspects of the training process itself.
- Examples include the learning rate, number of trees in ensemble methods, maximum depth of trees, etc.
Parameter Grid:
- A grid is defined by listing hyperparameters and the values you want to try. For example, you might specify a range of values for the learning rate and maximum depth.
- Each combination of hyperparameters forms a unique configuration that the model will be trained on.
Cross-Validation:
- Grid Search often incorporates cross-validation, a technique that splits the training dataset into several subsets (folds). The model is trained on all folds but one and validated on the held-out fold, which gives a more reliable performance estimate than a single train/validation split.
- The most common method is k-fold cross-validation, where the dataset is divided into k subsets and the model is trained and validated k times, each time using a different subset for validation.
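The k-fold splitting described above can be sketched in plain Python. This is only an illustration of the idea, not how scikit-learn implements it; the helper name `k_fold_indices` and the sample count of 10 are assumptions made for the example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for each of k contiguous folds."""
    # Spread any remainder so the first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        # Everything outside the validation fold is used for training.
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


folds = list(k_fold_indices(n_samples=10, k=5))
for i, (train, val) in enumerate(folds):
    print(f"fold {i}: validate on {val}, train on {len(train)} samples")
```

Each sample lands in exactly one validation fold, so every observation is used for validation once and for training k - 1 times.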
How Grid Search Works
Define the Model: Choose the machine learning model you want to optimize (e.g., SVM, XGBoost, Random Forest).
Set Hyperparameter Grid: Specify the hyperparameters and their corresponding values to test. For instance:
param_grid = { 'n_estimators': [50, 100, 200], 'max_depth': [3, 5, 7], 'learning_rate': [0.01, 0.1, 1] }
Model Training: For each combination of hyperparameters, the model is trained and evaluated using cross-validation.
Evaluate Performance: The performance metrics (like accuracy, precision, recall) are recorded for each combination.
Select Best Hyperparameters: After evaluating all combinations, the hyperparameters that resulted in the best performance are selected.
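The steps above can be sketched as a plain loop over every combination in the grid. The `evaluate` function below is a hypothetical stand-in with a made-up formula; in a real run it would train the model with cross-validation and return the mean score.

```python
from itertools import product

param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.1, 1],
}


def evaluate(params):
    # Hypothetical scoring function (an assumption for this sketch).
    # It is highest at learning_rate=0.1 and max_depth=5 and ignores n_estimators.
    return -abs(params['learning_rate'] - 0.1) - abs(params['max_depth'] - 5) / 10


# Enumerate every combination of the listed values.
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

# Score each combination and keep the best one.
best_params = max(combinations, key=evaluate)
best_score = evaluate(best_params)

print(len(combinations), "combinations evaluated")
print("best:", best_params)
```

Tools such as scikit-learn's `GridSearchCV` wrap this same loop and add cross-validation and bookkeeping of the results.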
Benefits of Grid Search
- Exhaustive Search: Every combination of the specified hyperparameter values is evaluated, so the best configuration within the grid is guaranteed to be found. Values outside the grid are never tried.
- Simplicity: The method is straightforward to implement and understand.
- Integrates with Cross-Validation: By incorporating cross-validation, Grid Search provides a robust evaluation of model performance.
Drawbacks of Grid Search
- Computationally Intensive: The model is trained once per combination per cross-validation fold, so a large parameter grid leads to long training times.
- Curse of Dimensionality: As the number of hyperparameters increases, the grid grows exponentially, making it less feasible for high-dimensional hyperparameter spaces.
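To make the cost concrete, the number of model fits can be counted directly. The sketch below reuses the example grid from earlier (three hyperparameters with three values each); the 5-fold setting, the helper name `n_fits`, and the extra `subsample` hyperparameter are assumptions added for illustration.

```python
from math import prod


def n_fits(param_grid, cv):
    """Number of model fits: one per combination per cross-validation fold."""
    return prod(len(values) for values in param_grid.values()) * cv


param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.1, 1],
}
print(n_fits(param_grid, cv=5))  # 27 combinations x 5 folds

# Adding one more hyperparameter with three values triples the work.
bigger_grid = dict(param_grid, subsample=[0.6, 0.8, 1.0])  # hypothetical extra setting
print(n_fits(bigger_grid, cv=5))  # 81 combinations x 5 folds
```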
Summary
Grid Search is a powerful and widely used technique for hyperparameter tuning in machine learning. It systematically evaluates different combinations of hyperparameters to find the best configuration for a given model, helping improve its predictive performance. However, due to its exhaustive nature, it can be computationally expensive, especially with a large parameter grid. For more efficient hyperparameter optimization, alternative methods like Random Search or Bayesian Optimization may also be considered.
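For comparison, the core idea behind Random Search can be sketched in a few lines: instead of evaluating every combination, only a fixed number of them is drawn at random. This is a simplified illustration over a discrete grid, not a full implementation; the budget `n_iter = 5` and the seed are assumptions.

```python
import random
from itertools import product

param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.1, 1],
}

names = list(param_grid)
all_combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

rng = random.Random(0)  # fixed seed so the draw is reproducible
n_iter = 5              # hypothetical budget: evaluate 5 of the 27 combinations
sampled = rng.sample(all_combinations, n_iter)  # sampling without replacement

for params in sampled:
    print(params)
```

Each sampled configuration would then be trained and scored exactly as in Grid Search, at a fraction of the cost.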
Let’s Review It in Practice
Grid Search using K-Fold
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('Social_Network_Ads.csv')
data.head()
| | User ID | Gender | Age | EstimatedSalary | Purchased |
|---|---|---|---|---|---|
| 0 | 15624510 | Male | 19.0 | 19000.0 | 0 |
| 1 | 15810944 | Male | 35.0 | 20000.0 | 0 |
| 2 | 15668575 | Female | 26.0 | 43000.0 | 0 |
| 3 | 15603246 | Female | 27.0 | 57000.0 | 0 |
| 4 | 15804002 | Male | 19.0 | 76000.0 | 0 |
# Features: columns 2 and 3 (Age, EstimatedSalary); target: column 4 (Purchased)
X = data.iloc[:, 2:4].values
y = data.iloc[:, 4].values
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.25,random_state=0)
# Age vs. EstimatedSalary before scaling
plt.scatter(X_train[:, 0], X_train[:, 1])
plt.show()
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Same two features after standardization
plt.scatter(X_train[:, 0], X_train[:, 1])
plt.show()
from sklearn.svm import SVC
classifier = SVC(kernel='rbf',random_state=0)
classifier.fit(X_train,y_train)
SVC(random_state=0)
from sklearn.model_selection import cross_val_score
accuracies = cross_val_score(estimator=classifier,X = X_train,y = y_train, cv = 10)
accuracies
array([0.8 , 0.96666667, 0.8 , 0.96666667, 0.86666667, 0.86666667, 0.9 , 0.93333333, 1. , 0.93333333])
accuracies.mean()
0.9033333333333333
accuracies.std()
0.06574360974438671
from sklearn.model_selection import GridSearchCV
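# Three sub-grids, one per kernel; every combination inside each sub-grid is tried.
# Note: 'degree' only affects the 'poly' kernel, so SVC ignores it for 'rbf' and
# the five values below repeat the same fit.
parameters = [{'C': [1, 10, 100, 1000], 'kernel': ['linear']},
              {'C': [1, 10, 100, 1000], 'kernel': ['sigmoid']},
              {'C': [0.7, 0.8, 0.9, 1.0, 1.1, 1.2], 'kernel': ['rbf'],
               'gamma': [1, 2, 3, 4, 5, 6, 7], 'degree': [1, 2, 3, 4, 5]}]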
grid_search = GridSearchCV(estimator=classifier,param_grid = parameters,scoring='accuracy',cv=10)
grid_search = grid_search.fit(X_train,y_train)
grid_search.best_score_
0.9133333333333334
grid_search.best_params_
{'C': 0.7, 'degree': 1, 'gamma': 3, 'kernel': 'rbf'}
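Beyond `best_score_` and `best_params_`, a fitted `GridSearchCV` also keeps the score of every candidate in `cv_results_`. The sketch below shows one way to inspect it. It runs on synthetic data from `make_classification` rather than on Social_Network_Ads.csv, and the small grid is a hypothetical one chosen so the example finishes quickly.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic two-feature data, used only so the example runs on its own.
X_demo, y_demo = make_classification(n_samples=200, n_features=2, n_informative=2,
                                     n_redundant=0, random_state=0)

# Hypothetical small grid: 2 values of C x 3 values of gamma = 6 candidates.
small_grid = {'C': [0.5, 1.0], 'gamma': [0.1, 1, 10], 'kernel': ['rbf']}
search = GridSearchCV(SVC(random_state=0), small_grid, scoring='accuracy', cv=5)
search.fit(X_demo, y_demo)

# cv_results_ holds one entry per candidate, in the order they were tried.
results = search.cv_results_
for params, mean, std, rank in zip(results['params'],
                                   results['mean_test_score'],
                                   results['std_test_score'],
                                   results['rank_test_score']):
    print(f"rank {rank}: {mean:.3f} +/- {std:.3f} for {params}")
```

Looking at the full table helps to see whether the best candidate is clearly ahead or only marginally better than its neighbours.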
from sklearn.svm import SVC
classifier = SVC(kernel='rbf',random_state=0,C=0.7,gamma=3,degree=1)
classifier.fit(X_train,y_train)
SVC(C=0.7, degree=1, gamma=3, random_state=0)
classifier.score(X_test,y_test) * 100
93.0
y_pred = classifier.predict(X_test)
y_pred
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1], dtype=int64)
y_test
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1], dtype=int64)
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cm
array([[64, 4], [ 3, 29]], dtype=int64)
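The confusion matrix above can be turned into the usual summary metrics by hand. The sketch below assumes scikit-learn's layout, where rows are true classes and columns are predicted classes, so the four cells are TN, FP, FN and TP in reading order.

```python
# Confusion matrix reported above: rows = true class, columns = predicted class.
cm = [[64, 4],
      [3, 29]]
(tn, fp), (fn, tp) = cm

accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # share of predicted positives that are correct
recall = tp / (tp + fn)                     # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

The accuracy of 0.93 agrees with the 93.0 returned by `classifier.score(X_test,y_test) * 100`.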