From 3364f1d353ec2ad096a4d603688c1c7f482dcb28 Mon Sep 17 00:00:00 2001 From: Ritesh Date: Mon, 10 Jun 2024 18:39:49 +0530 Subject: [PATCH 1/3] Create polynomial-regression.md --- .../machine-learning/polynomial-regression.md | 114 ++++++++++++++++++ 1 file changed, 114 insertions(+) create mode 100644 contrib/machine-learning/polynomial-regression.md diff --git a/contrib/machine-learning/polynomial-regression.md b/contrib/machine-learning/polynomial-regression.md new file mode 100644 index 0000000..3f1287e --- /dev/null +++ b/contrib/machine-learning/polynomial-regression.md @@ -0,0 +1,114 @@ +# Polynomial Regression + +Polynomial Regression is a form of regression analysis in which the relationship between the independent variable \( x \) and the dependent variable \( y \) is modeled as an \( n \)th degree polynomial. This README provides an overview of polynomial regression, including its fundamental concepts, assumptions, and how to implement it using Python. + +## Table of Contents + +1. [Introduction](#introduction) +2. [Concepts](#concepts) +3. [Assumptions](#assumptions) +4. [Implementation](#implementation) + - [Using Scikit-learn](#using-scikit-learn) + - [Code Example](#code-example) +5. [Evaluation Metrics](#evaluation-metrics) +6. [Conclusion](#conclusion) +7. [References](#references) + +## Introduction + +Polynomial Regression is used when the data shows a non-linear relationship between the independent variable \( x \) and the dependent variable \( y \). It extends the simple linear regression model by considering polynomial terms of the independent variable, allowing for a more flexible fit to the data. + +## Concepts + +### Polynomial Equation + +The polynomial regression model is based on the following polynomial equation: + +$$ +\[ y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \cdots + \beta_n x^n + \epsilon \] +$$ + +Where: +- \( y \) is the dependent variable. +- \( x \) is the independent variable. +- \( \beta_0, \beta_1, \ldots, \beta_n \) are the coefficients of the polynomial. +- \( \epsilon \) is the error term. + +### Degree of Polynomial + +The degree of the polynomial (n) determines the flexibility of the model. A higher degree allows the model to fit more complex, non-linear relationships, but it also increases the risk of overfitting. + +### Overfitting and Underfitting + +- **Overfitting**: When the model fits the noise in the training data too closely, resulting in poor generalization to new data. +- **Underfitting**: When the model is too simple to capture the underlying pattern in the data. + +## Assumptions + +1. **Independence**: Observations are independent of each other. +2. **Homoscedasticity**: The variance of the residuals (errors) is constant across all levels of the independent variable. +3. **Normality**: The residuals of the model are normally distributed. +4. **No Multicollinearity**: The predictor variables are not highly correlated with each other. + +## Implementation + +### Using Scikit-learn + +Scikit-learn is a popular machine learning library in Python that provides tools for polynomial regression. + +### Code Example + +```python +import numpy as np +import pandas as pd +import matplotlib.pyplot as plt +from sklearn.preprocessing import PolynomialFeatures +from sklearn.linear_model import LinearRegression +from sklearn.metrics import mean_squared_error, r2_score + +# Load dataset +data = pd.read_csv('path/to/your/dataset.csv') + +# Define features and target variable +X = data[['feature']] +y = data['target'] + +# Transform features to polynomial features +poly = PolynomialFeatures(degree=3) +X_poly = poly.fit_transform(X) + +# Initialize and train polynomial regression model +model = LinearRegression() +model.fit(X_poly, y) + +# Make predictions +y_pred = model.predict(X_poly) + +# Evaluate the model +mse = mean_squared_error(y, y_pred) +r2 = r2_score(y, y_pred) +print("Mean Squared Error:", mse) +print("R^2 Score:", r2) + +# Visualize the results +plt.scatter(X, y, color='blue') +plt.plot(X, y_pred, color='red') +plt.xlabel('Feature') +plt.ylabel('Target') +plt.title('Polynomial Regression') +plt.show() +``` + +## Evaluation Metrics + +- **Mean Squared Error (MSE)**: The average of the squared differences between actual and predicted values. +- **R-squared (R²) Score**: A statistical measure that represents the proportion of the variance for the dependent variable that is explained by the independent variables in the model. + +## Conclusion + +Polynomial Regression is a powerful tool for modeling non-linear relationships between variables. It is important to choose the degree of the polynomial carefully to balance between underfitting and overfitting. Understanding and properly evaluating the model using appropriate metrics ensures its effectiveness. + +## References + +- [Scikit-learn Documentation](https://scikit-learn.org/stable/modules/linear_model.html#polynomial-regression) +- [Wikipedia: Polynomial Regression](https://en.wikipedia.org/wiki/Polynomial_reg From b4aaf5587d714e5b925cde85626ab970e858edf0 Mon Sep 17 00:00:00 2001 From: Ritesh Date: Mon, 10 Jun 2024 18:40:58 +0530 Subject: [PATCH 2/3] Update index.md --- contrib/machine-learning/index.md | 1 + 1 file changed, 1 insertion(+) diff --git a/contrib/machine-learning/index.md b/contrib/machine-learning/index.md index 0bf39f9..c0e7368 100644 --- a/contrib/machine-learning/index.md +++ b/contrib/machine-learning/index.md @@ -20,3 +20,4 @@ - [Grid Search](grid-search.md) - [Transformers](transformers.md) - [K-nearest neighbor (KNN)](knn.md) +- [Polynomial Regression](polynomial-regression.md) From c8dffec853ce55b690c56d95fd3a008b371c7553 Mon Sep 17 00:00:00 2001 From: Ashita Prasad Date: Sat, 22 Jun 2024 16:17:30 +0530 Subject: [PATCH 3/3] Update polynomial-regression.md --- .../machine-learning/polynomial-regression.md | 26 +++++-------------- 1 file changed, 7 insertions(+), 19 deletions(-) diff --git a/contrib/machine-learning/polynomial-regression.md b/contrib/machine-learning/polynomial-regression.md index 3f1287e..d00ede3 100644 --- a/contrib/machine-learning/polynomial-regression.md +++ b/contrib/machine-learning/polynomial-regression.md @@ -1,22 +1,10 @@ # Polynomial Regression -Polynomial Regression is a form of regression analysis in which the relationship between the independent variable \( x \) and the dependent variable \( y \) is modeled as an \( n \)th degree polynomial. This README provides an overview of polynomial regression, including its fundamental concepts, assumptions, and how to implement it using Python. - -## Table of Contents - -1. [Introduction](#introduction) -2. [Concepts](#concepts) -3. [Assumptions](#assumptions) -4. [Implementation](#implementation) - - [Using Scikit-learn](#using-scikit-learn) - - [Code Example](#code-example) -5. [Evaluation Metrics](#evaluation-metrics) -6. [Conclusion](#conclusion) -7. [References](#references) +Polynomial Regression is a form of regression analysis in which the relationship between the independent variable $x$ and the dependent variable $y$ is modeled as an $nth$ degree polynomial. This guide provides an overview of polynomial regression, including its fundamental concepts, assumptions, and how to implement it using Python. ## Introduction -Polynomial Regression is used when the data shows a non-linear relationship between the independent variable \( x \) and the dependent variable \( y \). It extends the simple linear regression model by considering polynomial terms of the independent variable, allowing for a more flexible fit to the data. +Polynomial Regression is used when the data shows a non-linear relationship between the independent variable $x$ and the dependent variable $y$ is modeled as an $nth$ degree polynomial. It extends the simple linear regression model by considering polynomial terms of the independent variable, allowing for a more flexible fit to the data. ## Concepts @@ -29,10 +17,10 @@ $$ $$ Where: -- \( y \) is the dependent variable. -- \( x \) is the independent variable. -- \( \beta_0, \beta_1, \ldots, \beta_n \) are the coefficients of the polynomial. -- \( \epsilon \) is the error term. +- $y$ is the dependent variable. +- $x$ is the independent variable. +- $\beta_0, \beta_1, \ldots, \beta_n$ are the coefficients of the polynomial. +- $\epsilon$ is the error term. ### Degree of Polynomial @@ -111,4 +99,4 @@ Polynomial Regression is a powerful tool for modeling non-linear relationships b ## References - [Scikit-learn Documentation](https://scikit-learn.org/stable/modules/linear_model.html#polynomial-regression) -- [Wikipedia: Polynomial Regression](https://en.wikipedia.org/wiki/Polynomial_reg +- [Wikipedia: Polynomial Regression](https://en.wikipedia.org/wiki/Polynomial_reg)