Merge pull request #348 from Vrisha213/main

Added content: Confusion Matrix and SVM
pull/370/head^2
Ankit Mahato 2024-05-23 04:23:11 +05:30 committed by GitHub
commit ae127de13d
No key found in the database for this signature
GPG key ID: B5690EEEBB952194
3 changed files with 134 additions and 0 deletions

@@ -0,0 +1,70 @@
## Confusion Matrix
A confusion matrix is a fundamental performance evaluation tool used in machine learning to assess the accuracy of a classification model. It is an N x N matrix, where N represents the number of target classes.
For binary classification, it results in a 2 x 2 matrix that outlines four key parameters:
1. True Positive (TP) - The model predicted the positive class and the actual class was positive.
For example - the actual value was positive, and the model predicted a positive value.
2. True Negative (TN) - The model predicted the negative class and the actual class was negative.
For example - the actual value was negative, and the model predicted a negative value.
3. False Positive (FP)/Type I Error - The model predicted the positive class but the actual class was negative.
For example - the actual value was negative, but the model predicted a positive value.
4. False Negative (FN)/Type II Error - The model predicted the negative class but the actual class was positive.
For example - the actual value was positive, but the model predicted a negative value.
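These four outcomes arrange into the 2 x 2 matrix as follows (rows are actual classes, columns are predicted classes, matching scikit-learn's convention):
```
                    Predicted Positive    Predicted Negative
Actual Positive             TP                    FN
Actual Negative             FP                    TN
```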
The confusion matrix enables the calculation of various metrics such as accuracy, precision, recall, F1-score and specificity.
1. Accuracy - The proportion of correctly classified instances out of the total number of instances: (TP + TN) / (TP + TN + FP + FN).
2. Precision - The accuracy of the model's positive predictions: TP / (TP + FP).
3. Recall - The ability of the model to correctly identify all positive instances in the dataset: TP / (TP + FN). It is also known as sensitivity or the true positive rate.
4. F1-Score - The harmonic mean of precision and recall, offering a balanced evaluation of a classification model's effectiveness: 2 × (Precision × Recall) / (Precision + Recall).
5. Specificity - The ability of the model to correctly identify all negative instances: TN / (TN + FP). It is also known as the true negative rate.
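To make the formulas concrete, here is a minimal sketch that computes them directly from the four counts; the values match the Apple example further below, with "Apple" treated as the positive class:
```python
# Counts read off a 2 x 2 confusion matrix ("Apple" as the positive class)
tp, fn, fp, tn = 5, 1, 1, 3

accuracy = (tp + tn) / (tp + tn + fp + fn)           # 0.80
precision = tp / (tp + fp)                           # ~0.83
recall = tp / (tp + fn)                              # ~0.83
f1 = 2 * precision * recall / (precision + recall)   # ~0.83

print(f"Accuracy: {accuracy:.2f}, Precision: {precision:.2f}, "
      f"Recall: {recall:.2f}, F1: {f1:.2f}")
```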
To compute the confusion matrix in Python, we can use the confusion_matrix() function from the sklearn.metrics module of the scikit-learn library.
The function returns a 2D array that represents the confusion matrix.
We can also visualize the confusion matrix using a heatmap.
```python
# Import the necessary libraries
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns
import matplotlib.pyplot as plt

# Create NumPy arrays for the actual and predicted labels
actual = np.array(['Apple', 'Apple', 'Apple', 'Not Apple', 'Apple',
                   'Not Apple', 'Apple', 'Apple', 'Not Apple', 'Not Apple'])
predicted = np.array(['Apple', 'Not Apple', 'Apple', 'Not Apple', 'Apple',
                      'Apple', 'Apple', 'Apple', 'Not Apple', 'Not Apple'])

# Compute and print the confusion matrix
cm = confusion_matrix(actual, predicted)
print(cm)

# Plot the confusion matrix as a seaborn heatmap
sns.heatmap(cm,
            annot=True,
            fmt='g',
            xticklabels=['Apple', 'Not Apple'],
            yticklabels=['Apple', 'Not Apple'])
plt.xlabel('Prediction', fontsize=13)
plt.ylabel('Actual', fontsize=13)
plt.title('Confusion Matrix', fontsize=17)
plt.show()

# Classification report computed from the same labels
print(classification_report(actual, predicted))
```
### Results
```
1. Confusion Matrix:
[[5 1]
 [1 3]]

2. Classification Report:
              precision    recall  f1-score   support

       Apple       0.83      0.83      0.83         6
   Not Apple       0.75      0.75      0.75         4

    accuracy                           0.80        10
   macro avg       0.79      0.79      0.79        10
weighted avg       0.80      0.80      0.80        10
```

@@ -1,3 +1,5 @@
# List of sections
- [Regression in Machine Learning](Regression.md)
- [Confusion Matrix](confusion-matrix.md)
- [Support Vector Machine Algorithm](support-vector-machine.md)

@@ -0,0 +1,62 @@
## Support Vector Machine
Support Vector Machine (SVM) is one of the most popular supervised learning algorithms. It can be used for both classification and regression problems, but in machine learning it is primarily used for classification.
SVM can be of two types:
1. Linear SVM: Used for linearly separable data. If a dataset can be classified into two classes using a single straight line, the data is termed linearly separable, and the classifier used is called a linear SVM classifier.
2. Non-linear SVM: Used for data that is not linearly separable. If a dataset cannot be classified using a straight line, the data is termed non-linear, and the classifier used is called a non-linear SVM classifier.
Working of SVM - The goal of SVM is to find a hyperplane that separates the data points into different classes. A hyperplane is a line in 2D space, a plane in 3D space, or a higher-dimensional surface in n-dimensional space. The hyperplane is chosen in such a way that it maximizes the margin, which is the distance between the hyperplane and the closest data points of each class. The closest data points are called the support vectors.
The distance between the hyperplane and a data point "x" can be calculated using the formula
```
distance = (w . x + b) / ||w||
```
where "w" is the weight vector, "b" is the bias term, and "||w||" is the Euclidean norm of the weight vector. The weight vector "w" is perpendicular to the hyperplane and determines its orientation, while the bias term "b" determines its position.
The optimal hyperplane is found by solving an optimization problem: maximize the margin subject to the constraint that all data points are correctly classified. In other words, we want the hyperplane that maximizes the margin between the two classes while ensuring that no data point is misclassified. This is a convex optimization problem that can be solved using quadratic programming.

If the data points are not linearly separable, we can use a technique called the kernel trick to map the data points into a higher-dimensional space where they become separable. The kernel function computes the inner product between the mapped data points without computing the mapping itself, which lets us work in the higher-dimensional space without incurring the computational cost of mapping the points explicitly.
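For instance, the widely used RBF (Gaussian) kernel computes such an inner product for an implicit mapping; a minimal sketch, where the gamma value is illustrative:
```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """RBF (Gaussian) kernel: the inner product of x and y after an implicit
    higher-dimensional mapping, computed directly from the original points."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

x1 = np.array([1.0, 2.0])
x2 = np.array([2.0, 3.0])
print(rbf_kernel(x1, x2))  # close to 1 for nearby points, near 0 for distant ones
```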
1. Hyperplane:
There can be multiple lines/decision boundaries that segregate the classes in n-dimensional space, but we need to find the best decision boundary for classifying the data points. This best boundary is known as the hyperplane of SVM.
The dimensions of the hyperplane depend on the number of features in the dataset: if there are 2 features, the hyperplane is a straight line, and if there are 3 features, it is a 2-dimensional plane. We always create the hyperplane with the maximum margin, i.e. the maximum distance between it and the data points.
2. Support Vectors:
The data points or vectors that are closest to the hyperplane and that affect its position are termed support vectors. Since these vectors support the hyperplane, they are called support vectors.
3. Margin:
The margin is the gap between the two lines through the closest data points of the different classes. It can be calculated as the perpendicular distance from the hyperplane to the support vectors. A large margin is considered a good margin and a small margin is considered a bad margin; for a linear SVM the margin width equals 2 / ||w||, as the sketch below illustrates.
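A minimal scikit-learn sketch of that relationship; the toy dataset and the C value here are made up for illustration, not taken from the original text:
```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data (illustrative)
X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear', C=1e6)  # large C approximates a hard margin
clf.fit(X, y)

w = clf.coef_[0]                # learned weight vector
margin = 2 / np.linalg.norm(w)  # margin width of a linear SVM
print("Margin width:", margin)
```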
We will use the famous Iris dataset, which contains the sepal length, sepal width, petal length, and petal width of three species of iris flowers: Iris setosa, Iris versicolor, and Iris virginica. The goal is to classify the flowers into their respective species based on these four features. We load the iris dataset using load_iris and split the data into training and testing sets using train_test_split. We use a test size of 0.2, which means that 20% of the data will be used for testing and 80% for training. We set the random state to 42 to ensure reproducibility of the results.
### Implementation of SVM in Python
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# load the iris dataset
iris = load_iris()

# split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# create an SVM classifier with a linear kernel
svm = SVC(kernel='linear')

# train the SVM classifier on the training set
svm.fit(X_train, y_train)

# make predictions on the testing set
y_pred = svm.predict(X_test)

# calculate the accuracy of the classifier
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```
#### Output
```
Accuracy: 1.0
```