From 58cfc5b9021b44c461e14ab309e29e4a28a6be8d Mon Sep 17 00:00:00 2001
From: Ritesh
Date: Wed, 29 May 2024 10:34:20 +0530
Subject: [PATCH 1/5] Create Logistic_Regression.md

---
 .../machine-learning/Logistic_Regression.md | 121 ++++++++++++++++++
 1 file changed, 121 insertions(+)
 create mode 100644 contrib/machine-learning/Logistic_Regression.md

diff --git a/contrib/machine-learning/Logistic_Regression.md b/contrib/machine-learning/Logistic_Regression.md
new file mode 100644
index 0000000..3111550
--- /dev/null
+++ b/contrib/machine-learning/Logistic_Regression.md
@@ -0,0 +1,121 @@
+# Logistic Regression
+
+Logistic Regression is a statistical method used for binary classification problems. It is a type of regression analysis where the dependent variable is categorical. This README provides an overview of logistic regression, including its fundamental concepts, assumptions, and how to implement it using Python.
+
+## Table of Contents
+
+1. [Introduction](#introduction)
+2. [Concepts](#concepts)
+3. [Assumptions](#assumptions)
+4. [Implementation](#implementation)
+    - [Using Scikit-learn](#using-scikit-learn)
+    - [Code Example](#code-example)
+5. [Evaluation Metrics](#evaluation-metrics)
+6. [Conclusion](#conclusion)
+7. [References](#references)
+
+## Introduction
+
+Logistic Regression is used to model the probability of a binary outcome based on one or more predictor variables (features). It is widely used in various fields such as medical research, social sciences, and machine learning for tasks such as spam detection, fraud detection, and predicting user behavior.
+
+## Concepts
+
+### Sigmoid Function
+
+The logistic regression model uses the sigmoid function to map predicted values to probabilities. The sigmoid function is defined as:
+
+$$
+\sigma(z) = \frac{1}{1 + e^{-z}}
+$$
+
+Where \( z \) is a linear combination of the input features.
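
The sigmoid definition above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `sigmoid` and the sample inputs are illustrative assumptions and are not part of the patch. It branches on the sign of `z` so that `math.exp` is never called with a large positive argument.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score z to a probability between 0 and 1."""
    # For z >= 0, exp(-z) <= 1, so the textbook form cannot overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For z < 0, use the algebraically equivalent form e^z / (1 + e^z).
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Illustrative inputs (assumed values, chosen only to show the shape).
for z in (-4.0, 0.0, 4.0):
    print(f"sigmoid({z:+.1f}) = {sigmoid(z):.4f}")
```

A score of zero maps to a probability of exactly 0.5, and `sigmoid(z) + sigmoid(-z)` equals 1, which is why the sign of the linear combination decides the predicted class at the usual 0.5 threshold.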
+
+### Odds and Log-Odds
+
+- **Odds**: The odds represent the ratio of the probability of an event occurring to the probability of it not occurring.
+
+$$\text{Odds} = \frac{P(Y=1)}{P(Y=0)}$$
+
+- **Log-Odds**: The log-odds is the natural logarithm of the odds.
+
+ $$\text{Log-Odds} = \log \left( \frac{P(Y=1)}{P(Y=0)} \right)$$
+
+Logistic regression models the log-odds as a linear combination of the input features.
+
+### Model Equation
+
+The logistic regression model equation is:
+
+$$
+\log \left( \frac{P(Y=1)}{P(Y=0)} \right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n
+$$
+
+Where:
+- β₀ is the intercept.
+- βi are the coefficients for the predictor variables Xi.
+
+
+## Assumptions
+
+1. **Linearity**: The log-odds of the response variable are a linear combination of the predictor variables.
+2. **Independence**: Observations should be independent of each other.
+3. **No Multicollinearity**: Predictor variables should not be highly correlated with each other.
+4. **Large Sample Size**: Logistic regression requires a large sample size to provide reliable results.
+
+## Implementation
+
+### Using Scikit-learn
+
+Scikit-learn is a popular machine learning library in Python that provides tools for logistic regression.
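
Before turning to a library, the odds, log-odds, and model equation defined under Concepts can be traced by hand. The sketch below uses only the standard library; the helper names and the intercept and coefficient values are hypothetical, chosen so that the linear combination works out to 0.1. It shows that taking the log-odds of the predicted probability recovers that linear combination.

```python
import math


def odds(p: float) -> float:
    """Odds of an event with probability p: P(Y=1) / P(Y=0)."""
    return p / (1.0 - p)


def log_odds(p: float) -> float:
    """Natural logarithm of the odds."""
    return math.log(odds(p))


def predict_probability(intercept, coefficients, features):
    """Apply the model equation, then invert the log-odds with the sigmoid."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted parameters and one observation (assumed values).
beta_0 = -1.0
betas = [0.8, -0.5]
x = [2.0, 1.0]

p = predict_probability(beta_0, betas, x)  # z = -1.0 + 1.6 - 0.5 = 0.1
print("P(Y=1):", p)
print("odds:", odds(p))
print("log-odds:", log_odds(p))  # matches z up to floating-point error
```

Because the log-odds is linear in the features, each coefficient can be read as the change in log-odds for a one-unit increase in its predictor, holding the others fixed.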
+
+### Code Example
+
+```python
+import numpy as np
+import pandas as pd
+from sklearn.model_selection import train_test_split
+from sklearn.linear_model import LogisticRegression
+from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
+
+# Load dataset
+data = pd.read_csv('path/to/your/dataset.csv')
+
+# Define features and target variable
+X = data[['feature1', 'feature2', 'feature3']]
+y = data['target']
+
+# Split data into training and testing sets
+X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
+
+# Initialize and train logistic regression model
+model = LogisticRegression()
+model.fit(X_train, y_train)
+
+# Make predictions
+y_pred = model.predict(X_test)
+
+# Evaluate the model
+accuracy = accuracy_score(y_test, y_pred)
+conf_matrix = confusion_matrix(y_test, y_pred)
+class_report = classification_report(y_test, y_pred)
+
+print("Accuracy:", accuracy)
+print("Confusion Matrix:\n", conf_matrix)
+print("Classification Report:\n", class_report)
+```
+
+## Evaluation Metrics
+
+- **Accuracy**: The proportion of correctly classified instances among all instances.
+- **Confusion Matrix**: A table showing the number of true positives, true negatives, false positives, and false negatives.
+- **Precision, Recall, and F1-Score**: Metrics to evaluate the performance of the classification model.
+
+## Conclusion
+
+Logistic regression is a fundamental classification technique that is easy to implement and interpret. It is a powerful tool for binary classification problems and provides a probabilistic framework for predicting binary outcomes.
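
The metrics listed under Evaluation Metrics all derive from the four confusion-matrix counts. As a rough sketch, assuming a binary problem and made-up counts, they can be computed without any library; the function name `classification_metrics` and the example counts are hypothetical and do not come from the patch.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0 in this sketch.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Precision penalizes false positives and recall penalizes false negatives, so reporting both (or their harmonic mean, F1) is more informative than accuracy alone when the classes are imbalanced.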
+
+## References
+
+- [Scikit-learn Documentation](https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression)
+- [Wikipedia: Logistic Regression](https://en.wikipedia.org/wiki/Logistic_regression)
+- [Towards Data Science: Understanding Logistic Regression](https://towardsdatascience.com/understanding-logistic-regression-9b02c2aec102)

From a8c83302b7c7d063d98951799595c7c7b78050f5 Mon Sep 17 00:00:00 2001
From: Ritesh
Date: Wed, 29 May 2024 10:35:42 +0530
Subject: [PATCH 2/5] Update index.md

---
 contrib/machine-learning/index.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/machine-learning/index.md b/contrib/machine-learning/index.md
index 94ca1e2..0b9c2b0 100644
--- a/contrib/machine-learning/index.md
+++ b/contrib/machine-learning/index.md
@@ -9,3 +9,4 @@
 - [TensorFlow.md](tensorFlow.md)
 - [PyTorch.md](pytorch.md)
 - [Types of optimizers](Types_of_optimizers.md)
+- [Logistic Regression](Logistic_Regression.md)

From d1bc3127082c3a27b7f2eb27a52fbb4b1013cc36 Mon Sep 17 00:00:00 2001
From: Ankit Mahato
Date: Fri, 31 May 2024 06:21:12 +0530
Subject: [PATCH 3/5] Rename Logistic_Regression.md to logistic-regression.md

---
 .../{Logistic_Regression.md => logistic-regression.md} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename contrib/machine-learning/{Logistic_Regression.md => logistic-regression.md} (100%)

diff --git a/contrib/machine-learning/Logistic_Regression.md b/contrib/machine-learning/logistic-regression.md
similarity index 100%
rename from contrib/machine-learning/Logistic_Regression.md
rename to contrib/machine-learning/logistic-regression.md

From ad01516cf8f834d7c4afbf907cd72320bc432f1e Mon Sep 17 00:00:00 2001
From: Ankit Mahato
Date: Fri, 31 May 2024 06:21:28 +0530
Subject: [PATCH 4/5] Update index.md

---
 contrib/machine-learning/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/machine-learning/index.md b/contrib/machine-learning/index.md
index 0b9c2b0..46100df 100644
--- a/contrib/machine-learning/index.md
+++ b/contrib/machine-learning/index.md
@@ -9,4 +9,4 @@
 - [TensorFlow.md](tensorFlow.md)
 - [PyTorch.md](pytorch.md)
 - [Types of optimizers](Types_of_optimizers.md)
-- [Logistic Regression](Logistic_Regression.md)
+- [Logistic Regression](logistic-regression.md)

From 70547423e3af49f0c65c7b444f8d2987239e87a5 Mon Sep 17 00:00:00 2001
From: Ankit Mahato
Date: Fri, 31 May 2024 06:23:46 +0530
Subject: [PATCH 5/5] Update logistic-regression.md

---
 contrib/machine-learning/logistic-regression.md | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/contrib/machine-learning/logistic-regression.md b/contrib/machine-learning/logistic-regression.md
index 3111550..2e45e98 100644
--- a/contrib/machine-learning/logistic-regression.md
+++ b/contrib/machine-learning/logistic-regression.md
@@ -113,9 +113,3 @@ print("Classification Report:\n", class_report)
 ## Conclusion
 
 Logistic regression is a fundamental classification technique that is easy to implement and interpret. It is a powerful tool for binary classification problems and provides a probabilistic framework for predicting binary outcomes.
-
-## References
-
-- [Scikit-learn Documentation](https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression)
-- [Wikipedia: Logistic Regression](https://en.wikipedia.org/wiki/Logistic_regression)
-- [Towards Data Science: Understanding Logistic Regression](https://towardsdatascience.com/understanding-logistic-regression-9b02c2aec102)