Fall 2008

UC Berkeley

This advanced course focuses on applications of machine learning in robotics and control. It covers Markov decision processes, control, estimation, exploration/exploitation, and basic robotics principles. Recommended for graduate students.

This is an advanced course in learning for robotics and control. The goal of this course is to help the audience with their research in learning for robotics and control or related topics. A tentative list of topics includes:

- Markov decision processes: value iteration, policy iteration, linear programming, Q-learning, TD learning, value function approximation, inverse reinforcement learning
- Control: linear quadratic regulator, differential dynamic programming, receding horizon / model predictive control
- Estimation: (extended) Kalman filters, particle filters, SLAM
- Robotics: basic principles of various robots, sensors, microcontrollers
- Exploration/Exploitation: bandits, no-regret, e^3

Prerequisites: familiarity with mathematical proofs, machine learning, artificial intelligence, optimization, probability, algorithms, and linear algebra; ability to implement algorithmic ideas in code (C/C++ and MATLAB).

Graduate students only. Undergraduate students require the consent of the instructor: please talk to me after the first lecture and hand me a summary of relevant classes and experience so I can decide whether to make an exception.

Lecture notes are available under Syllabus.

No videos available.

Problem sets are available under Problem sets.

Related materials are available under Related materials.

- Asynchronous value iteration
- Bandits
- Contractions
- Differential dynamic programming (DDP)
- Dynamic programming
- Dynamics Modeling
- Exploration / Exploitation
- Function approximation
- Inverse reinforcement learning
- Kalman Filtering
- Learning to walk
- Linear Programming Approach
- Linearly-solvable Markov decision problems
- Linear–quadratic regulator (LQR)
- Markov Decision Process (MDP)
- Model Predictive Control
- Partially observable Markov decision process (POMDP)
- Policy Gradient
- Policy Iteration
- Q-learning
- Quadruped locomotion
- Reward Shaping
- Separation Principle
- Simultaneous Localization and Mapping (SLAM)
- State–action–reward–state–action (SARSA)
- TD-Gammon
- Temporal difference (TD) learning