From 5671b25a9f0341d9f783fef63d5427475dbafa45 Mon Sep 17 00:00:00 2001
From: Kosuri Indu
Date: Fri, 28 Jun 2024 07:43:37 +0530
Subject: [PATCH] Made changes

---
 contrib/machine-learning/gradient-descent.md | 31 +++++++-------------
 1 file changed, 10 insertions(+), 21 deletions(-)

diff --git a/contrib/machine-learning/gradient-descent.md b/contrib/machine-learning/gradient-descent.md
index bedb398..fa6da66 100644
--- a/contrib/machine-learning/gradient-descent.md
+++ b/contrib/machine-learning/gradient-descent.md
@@ -19,32 +19,21 @@ The core idea of Gradient Descent is to move in the direction of the steepest de
 
 ### Mathematical Formulation
 
-For a parameter \( \theta \):
-
-\[ \theta := \theta - \alpha \frac{\partial J(\theta)}{\partial \theta} \]
+For a parameter θ: `θ := θ − α(∂J(θ)/∂θ)`
 
 Where:
-- \( \theta \) is the parameter.
-- \( \alpha \) is the learning rate.
-- \( J(\theta) \) is the cost function.
+- `θ` is the parameter.
+- `α` is the learning rate.
+- `J(θ)` is the cost function.
 
 ## Hyperparameters
 
-### Learning Rate (\( \alpha \))
-
-The learning rate determines the size of the steps taken towards the minimum.
-
-### Number of Iterations
-
-This is the number of times the algorithm will update the parameters.
-
-### Batch Size
-
-In batch gradient descent, the entire dataset is used to compute the gradient. In stochastic gradient descent, each iteration uses a single data point. Mini-batch gradient descent uses a subset of data points.
-
-### Regularization Parameter
-
-This parameter is used to prevent overfitting by adding a penalty to the cost function based on the size of the parameters.
+| Hyperparameter           | Description                                                                                       |
+|--------------------------|---------------------------------------------------------------------------------------------------|
+| Learning Rate `α`        | Determines the size of the steps taken towards the minimum.                                       |
+| Number of Iterations     | Number of times the algorithm updates the parameters.                                             |
+| Batch Size               | Batch gradient descent uses the entire dataset to compute each gradient; stochastic gradient descent uses a single data point per update; mini-batch gradient descent uses a small subset of data points. |
+| Regularization Parameter | Prevents overfitting by adding a penalty to the cost function based on the size of the parameters. |
 
 ## Advantages and Disadvantages
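The following is a minimal sketch of the update rule `θ := θ − α(∂J(θ)/∂θ)` described in the patched document, applied to a linear-regression mean-squared-error cost. The NumPy implementation, the synthetic data, and the `lr`/`n_iters` values are illustrative assumptions, not part of the original tutorial or the patch.

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_iters=1000):
    """Fit theta for a linear model y ≈ X @ theta by minimising mean squared error."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        # Gradient of the MSE cost J(theta) = (1/m) * ||X @ theta - y||^2
        grad = (2.0 / m) * X.T @ (X @ theta - y)
        # Update rule from the document: theta := theta - alpha * dJ(theta)/dtheta
        theta -= lr * grad
    return theta

# Illustrative usage on tiny synthetic data (assumed values, for demonstration only)
X = np.c_[np.ones(5), np.arange(5.0)]  # bias column plus one feature
y = 3.0 + 2.0 * np.arange(5.0)         # generated with true parameters [3, 2]
print(gradient_descent(X, y))          # expected to approach [3. 2.]
```

Swapping the full-dataset gradient for a single sample or a small batch per step yields the stochastic and mini-batch variants listed in the hyperparameter table.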