@qgallouedec
Last active May 9, 2020 14:30
# Temporal-difference (TD) error for the SARSA update:
# \delta_t = R_{t+1} + \gamma * Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t)
delta_t = reward + gamma * Q[next_state][next_action] - Q[state][action]
# Move the current action-value estimate toward the TD target by a step of size alpha:
# Q(S_t, A_t) += alpha * \delta_t
Q[state][action] += alpha * delta_t
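
The update above relies on names (`Q`, `state`, `action`, `reward`, `next_state`, `next_action`, `alpha`, `gamma`) that are defined elsewhere. As a minimal sketch, the same two lines can be wrapped in a function and run on a toy table. The function name `sarsa_update`, the nested-list layout of `Q`, and all the numbers below are assumptions made for illustration; they are not part of the original gist.

```python
def sarsa_update(Q, state, action, reward, next_state, next_action, alpha, gamma):
    """Apply one SARSA update to Q in place and return the TD error.

    Q is assumed to be indexable as Q[state][action] (nested lists here;
    a dict of dicts or a 2-D array would work the same way).
    """
    # TD error: reward plus discounted value of the next state-action pair,
    # minus the current estimate.
    delta_t = reward + gamma * Q[next_state][next_action] - Q[state][action]
    # Step the current estimate toward the TD target.
    Q[state][action] += alpha * delta_t
    return delta_t


# Hypothetical toy problem: 2 states x 2 actions, values chosen by hand.
Q = [[0.0, 0.0],
     [1.0, 0.0]]

delta = sarsa_update(
    Q, state=0, action=1, reward=1.0,
    next_state=1, next_action=0,
    alpha=0.5, gamma=0.9,
)
print(delta, Q[0][1])
```

With these values the TD error is approximately 1 + 0.9 * 1.0 - 0 = 1.9, so `Q[0][1]` moves from 0 to approximately 0.5 * 1.9 = 0.95, while every other entry of `Q` is left unchanged.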