From 5cf4b9b5cbd88c094427cffff5a4881d3cc48118 Mon Sep 17 00:00:00 2001
From: Ojaswi Chopra
Date: Sat, 22 Jun 2024 23:01:52 +0530
Subject: [PATCH] Trial-2

---
 contrib/machine-learning/reinforcement-learning.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/contrib/machine-learning/reinforcement-learning.md b/contrib/machine-learning/reinforcement-learning.md
index 47c350b..760d530 100644
--- a/contrib/machine-learning/reinforcement-learning.md
+++ b/contrib/machine-learning/reinforcement-learning.md
@@ -300,11 +300,11 @@ Congratulations on completing your journey through this comprehensive guide to r
 
 *Happy coding, and may your RL adventures be rewarding!*
 
-\( Q(s, a) \leftarrow Q(s, a) + \alpha \left( r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right) \)
+$$ Q(s, a) \leftarrow Q(s, a) + \alpha \left( r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right) $$
 
 where:
-- \( Q(s, a) \) is the Q-value of state \( s \) and action \( a \).
-- \( r \) is the observed reward.
-- \( s' \) is the next state.
-- \( \alpha \) is the learning rate.
-- \( \gamma \) is the discount factor.
+- $Q(s, a)$ is the Q-value of state $s$ and action $a$.
+- $r$ is the observed reward.
+- $s'$ is the next state.
+- $\alpha$ is the learning rate.
+- $\gamma$ is the discount factor.
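
For context, the formula this patch reformats is the tabular Q-learning update. The sketch below is a minimal Python illustration of that update; it is not part of the patch or of reinforcement-learning.md, and the names and sizes (`q_table`, `n_states`, `n_actions`, `alpha`, `gamma`) are assumptions chosen for the example.

```python
import numpy as np

# Minimal tabular Q-learning update, mirroring the formula in the patched section:
# Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))

n_states, n_actions = 16, 4            # assumed sizes for illustration
q_table = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.99               # learning rate and discount factor

def q_update(s, a, r, s_next):
    """Apply one Q-learning update for the transition (s, a, r, s')."""
    td_target = r + gamma * q_table[s_next].max()   # r + gamma * max_a' Q(s', a')
    td_error = td_target - q_table[s, a]            # temporal-difference error
    q_table[s, a] += alpha * td_error

# Example: one transition from state 0 to state 1 via action 2 with reward 1.0
q_update(s=0, a=2, r=1.0, s_next=1)
```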