Support Vector Regression

1. What is Support Vector Regression?

Support Vector Regression (SVR) is a supervised machine learning method for predicting continuous values (like prices) from input features (like size, number of rooms). It builds on the idea of support vectors: a small subset of the training points that determines the fitted prediction model.

2. Basic Idea of SVR

Instead of only looking for the line that minimizes the total error (as in ordinary linear regression), SVR looks for a line with a “tube” around it that contains most of the data points. The half-width of the tube is a margin called epsilon (ε). Deviations smaller than ε are ignored; the goal is to keep the errors outside the tube small while keeping the model simple.

3. How SVR Works

Epsilon Tube: Imagine a tube around the prediction line. Data points that fall inside this tube are not considered errors. Only points outside this tube are counted as errors.
Support Vectors: The data points that lie on the edge of the tube or outside it are called support vectors. These points are crucial because only they influence the position of the tube and the prediction line; points strictly inside the tube have no effect on the fitted model.
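
The tube idea corresponds to what is usually called the epsilon-insensitive loss. The following is a minimal sketch in plain Python, not scikit-learn's internal implementation; the function name `epsilon_insensitive_loss` and the sample numbers are made up for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-point loss: 0 inside the tube, distance beyond the tube edge outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Hypothetical targets and predictions, chosen only to illustrate the idea.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.05, 2.5, 2.0]

losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
print(losses)  # the first point lies inside the tube, so its loss is 0.0
```

A wider tube (larger `epsilon`) ignores more points, which tends to give a simpler model with fewer support vectors.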

4. Collect Your Data

Gather the dataset you want to work with. For example, if you’re predicting house prices:
- Features might include size, number of bedrooms, and age of the house.
- The target variable is the price of the house.

5. Split the Data

Divide your data into:
- Training set: For training the model.
- Test set: For checking how well the model works on new data.
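
One common way to do this split is scikit-learn's `train_test_split`. The sketch below assumes a toy dataset invented for illustration (10 samples, one feature); the 80/20 ratio and the `random_state` value are arbitrary choices, not recommendations.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with one feature each.
X = [[i] for i in range(10)]
Y = [2 * i for i in range(10)]

# Hold out 20% of the samples for testing; random_state makes the split repeatable.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))
```

The resulting `X_train`, `Y_train` and `X_test`, `Y_test` have the same roles as the variables used in the training and evaluation steps.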

6. Choose the Kernel Function

SVR can use different types of kernel functions to transform the input data. Common kernels include:
- Linear: Straight line; good for simple relationships.
- Polynomial: Curved line; captures more complex relationships.
- Radial Basis Function (RBF): A flexible function that can fit many shapes.
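
To make the three kernels concrete, here is a small sketch of their usual formulas in plain Python. The function names and the default values for `gamma`, `degree` and `coef0` are assumptions made for this illustration; they are not scikit-learn's defaults.

```python
import math


def linear_kernel(a, b):
    """Plain dot product <a, b>."""
    return sum(ai * bi for ai, bi in zip(a, b))


def polynomial_kernel(a, b, degree=3, gamma=1.0, coef0=0.0):
    """(gamma * <a, b> + coef0) ** degree."""
    return (gamma * linear_kernel(a, b) + coef0) ** degree


def rbf_kernel(a, b, gamma=1.0):
    """exp(-gamma * squared Euclidean distance between a and b)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


# Two hypothetical feature vectors.
a, b = [1.0, 2.0], [2.0, 0.0]
print(linear_kernel(a, b))      # 2.0
print(polynomial_kernel(a, b))  # 8.0
print(rbf_kernel(a, a))         # 1.0, a point is maximally similar to itself
```

The RBF value shrinks toward 0 as two points move apart, which is what lets it fit local, curved patterns.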

7. Train the SVR Model

Use a machine learning library to create and train the SVR model with your training data.

from sklearn.svm import SVR

# Create the SVR model
model = SVR(kernel='rbf')  # Use RBF kernel for flexibility
model.fit(X_train, Y_train)  # Train the model

8. Make Predictions

Use the trained model to make predictions on your test set or new data.

Y_pred = model.predict(X_test)

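9. Evaluate the Model

Check how well your model performed by comparing the predicted values to the actual values using metrics like:
- Mean Squared Error (MSE): The average of the squared differences between predicted and actual values; lower is better.
- R-squared (R²): Indicates how much of the variability of the target variable the model explains. A value of 1 is a perfect fit, and the value can be negative when the model does worse than always predicting the mean.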

from sklearn.metrics import mean_squared_error, r2_score

mse = mean_squared_error(Y_test, Y_pred)
r2 = r2_score(Y_test, Y_pred)

print("Mean Squared Error:", mse)
print("R-squared:", r2)

10. Analyze Errors

Look at the errors (differences between predicted and actual values). This helps you see if your model is performing well or if it needs adjustments.

11. Use the Model for Future Predictions

After validating your model, you can use it to predict values for new inputs.

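# size, bedrooms and age are placeholders: replace them with real values,
# in the same feature order that was used for training
new_data = [[size, bedrooms, age]]  # Example new data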
predictions = model.predict(new_data)

12. Conclusion

Summarize how well the SVR model performed and discuss the results. Mention how the support vectors influenced the predictions.

Let’s walk through an example step by step.

Import the Libraries

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Make a list or Read Data

In [2]:

l = [[1,45],[2,51],[3,60],[4,80],[5,110],[6,150],[7,200],[8,240]]
l

Out[2]:

[[1, 45], [2, 51], [3, 60], [4, 80], [5, 110], [6, 150], [7, 200], [8, 240]]

In [3]:

df = pd.DataFrame(l,columns=['x','y'])
df

Out[3]:

	x	y
0	1	45
1	2	51
2	3	60
3	4	80
4	5	110
5	6	150
6	7	200
7	8	240

In [4]:

x = df.iloc[:,:1].values
x

Out[4]:

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]], dtype=int64)

In [5]:

y = df.iloc[:,1].values
y

Out[5]:

array([ 45,  51,  60,  80, 110, 150, 200, 240], dtype=int64)

Plot scatter x and y

In [6]:

plt.scatter(x,y)
plt.show()

Fit the Algorithm

In [8]:

from sklearn.svm import SVR
reg = SVR(kernel='rbf')
reg.fit(x,y)

Out[8]:

SVR()

Predict y

In [9]:

y_pred = reg.predict(x)
y_pred

Out[9]:

array([92.58372724, 92.117258  , 92.58298256, 94.04747182, 95.95252818,
       97.41701744, 97.882742  , 97.41627276])

In [10]:

y

Out[10]:

array([ 45,  51,  60,  80, 110, 150, 200, 240], dtype=int64)

Plot scatter x and y

Plot line x and y predict

In [11]:

plt.scatter(x,y)
plt.plot(x,y_pred)
plt.show()

Check accuracy (R² score, as a percentage)

In [12]:

reg.score(x,y)*100

Out[12]:

-4.342009418562265
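
For a regressor, `score` returns the coefficient of determination R² rather than a classification accuracy. The sketch below recomputes it by hand in plain Python from the values printed in the cells above; a negative result means the model does worse than always predicting the mean of y.

```python
# Values copied from the cells above.
y = [45, 51, 60, 80, 110, 150, 200, 240]
y_pred = [92.58372724, 92.117258, 92.58298256, 94.04747182,
          95.95252818, 97.41701744, 97.882742, 97.41627276]

y_mean = sum(y) / len(y)
ss_res = sum((t - p) ** 2 for t, p in zip(y, y_pred))  # residual sum of squares
ss_tot = sum((t - y_mean) ** 2 for t in y)             # total sum of squares

r2 = 1 - ss_res / ss_tot
print(r2 * 100)  # about -4.34, in line with reg.score(x, y) * 100
```

The predictions all sit between roughly 92 and 98 while y ranges from 45 to 240, so the fitted curve is nearly flat.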

In [13]:

reg.predict([[6.5]])

Out[13]:

array([97.78455023])

Standardization

In [15]:

x

Out[15]:

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]], dtype=int64)

In [16]:

y

Out[16]:

array([ 45,  51,  60,  80, 110, 150, 200, 240], dtype=int64)

Plot scatter x and y

In [17]:

plt.scatter(x,y)
plt.show()

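Fit the Algorithm

We use a standard scaler to normalize the data by removing the mean and scaling to unit variance, which helps improve the performance of many machine learning algorithms. SVR is sensitive to the scale of both the features and the target, which is why the unscaled model above fit so poorly, so we scale x and y with separate scalers.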

In [18]:

from sklearn.preprocessing import StandardScaler
sc_x = StandardScaler()
sc_y = StandardScaler()

In [19]:

X = sc_x.fit_transform(x)
X

Out[19]:

array([[-1.52752523],
       [-1.09108945],
       [-0.65465367],
       [-0.21821789],
       [ 0.21821789],
       [ 0.65465367],
       [ 1.09108945],
       [ 1.52752523]])

In [20]:

Y = sc_y.fit_transform(y.reshape(-1,1)).reshape(-1)
Y

Out[20]:

array([-1.05424509, -0.96639133, -0.83461069, -0.54176484, -0.10249605,
        0.48319567,  1.21531031,  1.80100203])
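
What the scaler computes can be checked by hand. This is a minimal sketch in plain Python, assuming the population standard deviation (dividing by n), which is what `StandardScaler` uses; the helper name `standardize` is made up for illustration.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [45, 51, 60, 80, 110, 150, 200, 240]


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


X = standardize(x)
Y = standardize(y)
print(X[0], Y[0])  # approximately -1.5275 and -1.0542, as in Out[19] and Out[20]
```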

Plot scatter x and y

In [21]:

plt.scatter(X,Y)
plt.show()

Fit the algorithm

In [22]:

from sklearn.svm import SVR
reg = SVR(kernel='rbf')
reg.fit(X,Y)

Out[22]:

SVR()

Predict y

In [23]:

y_pred = reg.predict(X)
y_pred

Out[23]:

array([-0.95434326, -0.93171546, -0.73438987, -0.44178931, -0.00517628,
        0.58239451,  1.11521213,  1.28275184])
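
It can help to look at the residuals (actual minus predicted) next to the tube width. The sketch below uses the values printed in Out[20] and Out[23] and assumes the model was fit with scikit-learn's default `epsilon=0.1`, since no other value was passed to `SVR`.

```python
# Values copied from Out[20] (scaled targets) and Out[23] (predictions).
Y = [-1.05424509, -0.96639133, -0.83461069, -0.54176484,
     -0.10249605, 0.48319567, 1.21531031, 1.80100203]
y_pred = [-0.95434326, -0.93171546, -0.73438987, -0.44178931,
          -0.00517628, 0.58239451, 1.11521213, 1.28275184]

epsilon = 0.1  # default tube half-width of sklearn.svm.SVR

residuals = [t - p for t, p in zip(Y, y_pred)]
for i, r in enumerate(residuals):
    print(f"point {i}: residual {r:+.4f}")

# Index of the point the model misses by the largest margin.
worst = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
print("largest error at index", worst)
```

Most residuals come out close to ±0.1, the edge of the tube, while the last point lies well outside it.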

Plot scatter x and y

Plot line x and y predict

In [24]:

plt.scatter(X,Y)
plt.plot(X,y_pred)
plt.show()

Predict future value of y

In [25]:

val = sc_x.transform([[6.5]])
val

Out[25]:

array([[0.87287156]])

In [26]:

val = reg.predict(val)
val

Out[26]:

array([0.87621202])

In [27]:

val = sc_y.inverse_transform([val])
val

Out[27]:

array([[176.84117556]])
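
The inverse transform simply undoes the scaling: multiply by the standard deviation of y and add its mean. A minimal sketch in plain Python, using the scaled prediction printed in Out[26]; the variable names are chosen for illustration only.

```python
import math

y = [45, 51, 60, 80, 110, 150, 200, 240]

mean_y = sum(y) / len(y)
std_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / len(y))  # population std

scaled_prediction = 0.87621202  # value from Out[26]
prediction = scaled_prediction * std_y + mean_y
print(prediction)  # approximately 176.84, as in Out[27]
```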

In [28]:

round(val[0][0])

Out[28]:

177
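
The manual scale, fit, predict and inverse-transform steps can also be bundled into a single estimator. The following is one possible sketch, not the approach used in the cells above: it assumes scikit-learn's `make_pipeline` and `TransformedTargetRegressor`, and it should give roughly the same prediction for 6.5 as the manual steps.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

x = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
y = np.array([45, 51, 60, 80, 110, 150, 200, 240])

# Scale the feature inside a pipeline and the target via TransformedTargetRegressor,
# so predictions come back in the original units of y.
model = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), SVR(kernel='rbf')),
    transformer=StandardScaler(),
)
model.fit(x, y)

prediction = model.predict([[6.5]])
print(prediction)
```

Bundling the scalers with the model means new inputs are passed in their original units, which removes the risk of forgetting a transform step.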
