From 194fc52150204544790c525234ad7f21387c4b59 Mon Sep 17 00:00:00 2001
From: Ojaswi Chopra <99067527+ojaswichopra@users.noreply.github.com>
Date: Sat, 22 Jun 2024 23:06:47 +0530
Subject: [PATCH] Update reinforcement-learning.md

---
 contrib/machine-learning/reinforcement-learning.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/machine-learning/reinforcement-learning.md b/contrib/machine-learning/reinforcement-learning.md
index fcc26d6..027b682 100644
--- a/contrib/machine-learning/reinforcement-learning.md
+++ b/contrib/machine-learning/reinforcement-learning.md
@@ -113,7 +113,7 @@ Q-Learning is a model-free algorithm used in reinforcement learning to learn the
 - Choose an action using an exploration strategy (e.g., epsilon-greedy).
 - Take the action, observe the reward and the next state.
 - Update the Q-value of the current state-action pair using the Bellman equation:
- $$ Q(s, a) \leftarrow Q(s, a) + \alpha \left( r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right) $$
+ $$Q(s, a) \leftarrow Q(s, a) + \alpha \left( r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right)$$
 where:
 - $Q(s, a)$ is the Q-value of state $s$ and action $a$.
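
Reviewer note: the patch only changes the spacing inside the `$$ ... $$` delimiters; the update rule itself is unchanged. For reference, that rule can be sketched in Python as below. This is a minimal sketch, not code from the patched file: the function name `q_update`, the list-of-lists table layout `q[state][action]`, and the default values of `alpha` and `gamma` are all assumptions made for illustration.

```python
def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q(s, a).

    q is assumed to be a list of lists indexed as q[state][action].
    """
    # Target: immediate reward plus the discounted best Q-value of the next state.
    target = r + gamma * max(q[s_next])
    # Move Q(s, a) a fraction alpha of the way toward the target.
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


# Hypothetical table with two states and two actions.
q = [[0.0, 0.0], [0.0, 1.0]]
new_value = q_update(q, s=0, a=1, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
```

Here the target is `1.0 + 0.9 * 1.0`, so `Q(0, 1)` moves halfway from 0 toward 1.9. Only the visited state-action pair is modified; every other entry of the table is left untouched.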