State–action–reward–state–action (SARSA)


SARSA is a reinforcement learning algorithm for learning a policy for a Markov decision process. It updates the Q-value of a state–action pair using the current state, the action taken, the reward received, the next state, and the next action chosen. The name SARSA is an acronym for this quintuple. It was proposed by Rummery and Niranjan in a technical note.
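As a rough illustration of the update described above, here is a minimal tabular sketch in Python. The function name `sarsa_update`, the dictionary-based Q-table, and the default learning rate `alpha` and discount factor `gamma` are assumptions made for this example; they do not come from the original technical note.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Apply one SARSA update to the Q-table in place and return the new Q(s, a).

    alpha (learning rate) and gamma (discount factor) are illustrative defaults.
    """
    # The target uses the action actually chosen in the next state (a_next).
    td_target = r + gamma * Q[(s_next, a_next)]
    # Move Q(s, a) a fraction alpha of the way toward the target.
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q[(s, a)]


# Hypothetical usage: unseen state-action pairs start at 0.0.
Q = defaultdict(float)
first = sarsa_update(Q, s=0, a="right", r=1.0, s_next=1, a_next="right")
second = sarsa_update(Q, s=1, a="right", r=0.0, s_next=0, a_next="right")
```

Each call consumes exactly one (state, action, reward, next state, next action) quintuple, which is where the algorithm's name comes from.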

2 courses cover this concept

CS 221 Artificial Intelligence: Principles and Techniques

Artificial Intelligence

Stanford University

Autumn 2022-2023

Stanford's CS 221 course teaches the foundational principles and practical implementation of AI systems. It covers machine learning, game playing, constraint satisfaction, graphical models, and logic. It is a rigorous course that requires solid foundational skills in programming, math, and probability.

Histories of AI, Societal impacts of AI, Vector (mathematics and physics), Dot product, Geometric interpretations, Taking gradients, Discrete random variables, Probability distributions, + 80 more concepts

CS 294-40: Learning for robotics and control

Robotics

UC Berkeley

Fall 2008

This advanced course focuses on applications of machine learning to robotics and control. It covers a wide range of topics, including Markov decision processes, control theory, estimation methods, and robotics principles. It is recommended for graduate students.

Markov Decision Process (MDP), Contractions, Asynchronous value iteration, Linear–quadratic regulator (LQR), Differential dynamic programming (DDP), Quadruped locomotion, + 21 more concepts