Polynomial Regression

1. What is Polynomial Regression?

Polynomial regression is a type of regression analysis that models the relationship between a dependent variable (such as house price) and one or more independent variables (such as size) as an nth-degree polynomial. It captures relationships that follow a curve rather than a straight line.

2. Why Use Polynomial Regression?

Sometimes the relationship between the variables is curved rather than straight. Polynomial regression can fit curves to the data, which makes it more flexible than simple linear regression.

3. The Polynomial Equation

The equation looks like this:

\[ Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \dots + \beta_n X^n \]

Where:
- \( Y \): The target variable you want to predict (like price).
- \( X \): The independent variable (like size).
- \( \beta_0 \): The intercept (the value of Y when X is zero).
- \( \beta_1, \beta_2, \dots, \beta_n \): Coefficients that weight each power of X. The model is still linear in these coefficients, which is why an ordinary linear regression can fit it.
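As a small illustration of the equation, the sketch below evaluates a degree-2 polynomial for a few values of X. The coefficient values are made up for the example and do not come from a fitted model.

```python
import numpy as np

# Hypothetical coefficients, chosen only for illustration:
# Y = 50 + 2*X + 3*X^2
beta = [50.0, 2.0, 3.0]  # beta_0, beta_1, beta_2

X = np.array([0.0, 1.0, 2.0])

# Sum beta_k * X^k over all powers k
Y = sum(b * X**k for k, b in enumerate(beta))
print(Y)  # [50. 55. 66.]
```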

4. Collect Your Data

Gather the dataset you want to analyze. For example, if you’re predicting house prices:
- Features might include size (square feet), number of bedrooms, etc.
- The target variable is the price of the house.

5. Split the Data

Divide your dataset into:
- Training set: For training the polynomial regression model.
- Test set: For checking how well the model performs on new data.
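One common way to make this split is scikit-learn's train_test_split. The sketch below uses placeholder data invented for the example. The 80/20 ratio and the random_state value are assumptions, not recommendations from this article.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data, made up for illustration: 10 samples, 1 feature
X = np.arange(1, 11).reshape(-1, 1)
Y = 3 * X.ravel() ** 2 + 5

# Hold out 20% of the rows for testing; random_state makes the split repeatable
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (8, 1) (2, 1)
```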

6. Prepare the Data

To use polynomial regression, you need to create polynomial features from your independent variables. This means generating new features that are powers of the original features (and, when there are several features, products of them).

from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2)  # Change degree for more curves
X_poly = poly.fit_transform(X_train)  # Create polynomial features for training data
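To see what the transformer produces, the following sketch expands two made-up values with degree=2. The names X_small, poly_demo, and expanded are used only for this illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X_small = np.array([[2.0], [3.0]])  # made-up values for illustration

poly_demo = PolynomialFeatures(degree=2)
expanded = poly_demo.fit_transform(X_small)

# Each row becomes [1, x, x^2]; the leading 1 is the bias column
print(expanded)
# [[1. 2. 4.]
#  [1. 3. 9.]]
```

Passing include_bias=False drops the column of ones. LinearRegression fits its own intercept by default, so the model works either way.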

7. Train the Polynomial Regression Model

Train a linear regression model on the polynomial features created from your training data.

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_poly, Y_train)  # Train the model with polynomial features
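As an alternative to transforming and fitting in two separate steps, the two stages can be chained with scikit-learn's make_pipeline. This is a minimal sketch on made-up data that follows Y = 1 + 2*X^2 exactly. The name pipe is an assumption for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Made-up training data following Y = 1 + 2*X^2 exactly
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
Y_train = 1 + 2 * X_train.ravel() ** 2

# The pipeline expands the features and fits the regression in one call
pipe = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
pipe.fit(X_train, Y_train)

# predict() applies the same feature expansion automatically
pred = pipe.predict(np.array([[5.0]]))
print(pred)  # close to 51, since 1 + 2*5^2 = 51
```

With a pipeline there is no separate transform call to remember when predicting on new data.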

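8. Make Predictions

For predictions, you also need to create polynomial features for your test set or new data. Use transform rather than fit_transform here, so the test data goes through the same transformer that was fitted on the training set.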

X_test_poly = poly.transform(X_test)  # Transform test data
Y_pred = model.predict(X_test_poly)  # Make predictions

9. Evaluate the Model

Check how well your model performed by comparing the predicted values to the actual values using metrics like:
- Mean Squared Error (MSE): The average of the squared differences between predicted and actual values. Lower is better.
- R-squared (R²): The proportion of the variance in the target variable that the model explains. A value of 1 is a perfect fit.

from sklearn.metrics import mean_squared_error, r2_score

mse = mean_squared_error(Y_test, Y_pred)
r2 = r2_score(Y_test, Y_pred)

print("Mean Squared Error:", mse)
print("R-squared:", r2)
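To make the two metrics concrete, the sketch below computes them by hand with NumPy. The values in y_true and y_hat are invented for the example.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up actual and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.0, 7.5, 10.0])

# MSE: mean of the squared residuals
mse_manual = np.mean((y_true - y_hat) ** 2)

# R^2: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y_true - y_hat) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print(f"{mse_manual:.3f} {r2_manual:.3f}")  # 0.375 0.925
```

These hand-computed values match what mean_squared_error and r2_score return for the same inputs.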

10. Visualize the Results (Optional)

When the model has a single input feature, you can plot the polynomial regression curve over the data points to see how well it fits.

import matplotlib.pyplot as plt
import numpy as np

# Create a range of values for X for plotting
X_range = np.linspace(np.min(X_train), np.max(X_train), 100).reshape(-1, 1)
X_range_poly = poly.transform(X_range)  # Transform the range for plotting
Y_range_pred = model.predict(X_range_poly)  # Predictions for the range

plt.scatter(X_train, Y_train, color='blue')  # Original data points
plt.plot(X_range, Y_range_pred, color='red')  # Polynomial curve
plt.title('Polynomial Regression')
plt.xlabel('Size')
plt.ylabel('Price')
plt.show()

11. Conclusion

Summarize how well the polynomial regression model performed and describe the shape of the fitted curve. Note that polynomial regression can capture more complex relationships than linear regression.

Let’s review an example step by step.

Polynomial Regression

Import the Libraries

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Create a List or Read the Data

In [2]:

l = [[1,45],[2,51],[3,60],[4,80],[5,110],[6,150],[7,200],[8,240]]
l

Out[2]:

[[1, 45], [2, 51], [3, 60], [4, 80], [5, 110], [6, 150], [7, 200], [8, 240]]

Convert the List into a DataFrame

In [3]:

df = pd.DataFrame(l,columns=['x','y'])
df

Out[3]:

	x	y
0	1	45
1	2	51
2	3	60
3	4	80
4	5	110
5	6	150
6	7	200
7	8	240

Extract the Values of x

In [4]:

x = df.iloc[:,:1].values
x

Out[4]:

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]], dtype=int64)

Extract the Values of y

In [5]:

y = df.iloc[:,1].values
y

Out[5]:

array([ 45,  51,  60,  80, 110, 150, 200, 240], dtype=int64)

Scatter Plot of x and y

In [6]:

plt.scatter(x,y)
plt.show()

[Figure: scatter plot of x against y]

Fit the Linear Regression Model

In [7]:

from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(x,y)

Out[7]:

LinearRegression()


Predict y

In [8]:

y_pred = reg.predict(x)
y_pred

Out[8]:

array([ 16.58333333,  45.27380952,  73.96428571, 102.6547619 ,
       131.3452381 , 160.03571429, 188.72619048, 217.41666667])

In [9]:

y

Out[9]:

array([ 45,  51,  60,  80, 110, 150, 200, 240], dtype=int64)

Plot the Data Points and the Fitted Line

In [10]:

plt.scatter(x,y)
plt.plot(x,y_pred)
plt.show()

Check the R² Score

In [11]:

reg.score(x,y)*100

Out[11]:

92.65161550496813
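The number returned by reg.score for a regression model is the R² coefficient of determination, not a classification accuracy. The sketch below rebuilds the example data and compares it with sklearn.metrics.r2_score. It assumes x holds the integers 1 to 8 as in the DataFrame above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Same data as the example above
x = np.arange(1, 9).reshape(-1, 1)
y = np.array([45, 51, 60, 80, 110, 150, 200, 240])

reg = LinearRegression().fit(x, y)

# score() and r2_score() report the same quantity
print(reg.score(x, y) * 100)
print(r2_score(y, reg.predict(x)) * 100)
```

Both lines print the same value, about 92.65. It is computed on the data the model was trained on, so it says nothing about performance on unseen data.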

Predict y for a New Value of x

In [12]:

reg.predict([[2]])

Out[12]:

array([45.27380952])

Using Polynomial Regression

In [13]:

x

Out[13]:

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]], dtype=int64)

In [14]:

y

Out[14]:

array([ 45,  51,  60,  80, 110, 150, 200, 240], dtype=int64)

Create the Polynomial Features

In [15]:

from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=2)
X = poly.fit_transform(x)
X

Out[15]:

array([[ 1.,  1.,  1.],
       [ 1.,  2.,  4.],
       [ 1.,  3.,  9.],
       [ 1.,  4., 16.],
       [ 1.,  5., 25.],
       [ 1.,  6., 36.],
       [ 1.,  7., 49.],
       [ 1.,  8., 64.]])

Fit the Model on the Polynomial Features

In [16]:

from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(X,y)

Out[16]:

LinearRegression()
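Once the model is fitted, its coefficients can be read back to recover the fitted equation. The sketch below rebuilds the same data and fit, then cross-checks the result against np.polyfit. The names b0, b1, and b2 are introduced only for this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Same data as the example above
x = np.arange(1, 9).reshape(-1, 1)
y = np.array([45, 51, 60, 80, 110, 150, 200, 240])

X = PolynomialFeatures(degree=2).fit_transform(x)
reg = LinearRegression().fit(X, y)

# coef_[0] belongs to the bias column and is effectively 0, because
# LinearRegression fits the intercept separately
b0 = reg.intercept_
b1, b2 = reg.coef_[1], reg.coef_[2]
print(f"y = {b0:.3f} {b1:+.3f}*x {b2:+.3f}*x^2")

# Cross-check against NumPy's least-squares polynomial fit
# (np.polyfit returns the highest power first)
print(np.polyfit(x.ravel(), y, deg=2))
```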

Predict y

In [17]:

y_pred = reg.predict(X)
y_pred

Out[17]:

array([ 44.33333333,  49.23809524,  62.07142857,  82.83333333,
       111.52380952, 148.14285714, 192.69047619, 245.16666667])

In [18]:

y

Out[18]:

array([ 45,  51,  60,  80, 110, 150, 200, 240], dtype=int64)

Plot the Data Points and the Fitted Curve

In [19]:

plt.scatter(x,y)
plt.plot(x,y_pred)
plt.show()

Check the R² Score

In [20]:

reg.score(X,y)*100

Out[20]:

99.72728224054805

Predict y for a New Value of x

In [21]:

val = poly.transform([[6.5]])
val

Out[21]:

array([[ 1.  ,  6.5 , 42.25]])

In [22]:

reg.predict(val)

Out[22]:

array([169.42559524])
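To compare the two fits side by side, the whole example can be condensed into a short loop over the polynomial degree. This is a sketch using scikit-learn's make_pipeline rather than the separate transform and fit cells above. The names scores and model are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same data as the example above
x = np.arange(1, 9).reshape(-1, 1)
y = np.array([45, 51, 60, 80, 110, 150, 200, 240])

scores = {}
for degree in (1, 2):
    # degree=1 is plain linear regression, degree=2 adds the x^2 term
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(x, y)
    scores[degree] = model.score(x, y) * 100  # R^2 as a percentage
    print(degree, round(scores[degree], 2), model.predict([[6.5]]))
```

The R² values are about 92.65 for degree 1 and 99.73 for degree 2, matching the cells above. Both are measured on the training data, so they are not an estimate of how either model would do on new data.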
