Adversarial Learning and Imitation
Learn about the generative adversarial imitation learning algorithm.
Given a set of expert observations (of a driver or a champion video game player), we would like to find a reward function that assigns a high reward to an agent that matches expert behavior and a low reward to agents that do not. At the same time, we want to choose such an agent’s policy,
GAIL
In what follows, we use a cost function instead of a reward function to match the convention in the referenced literature on this topic. The cost is simply the negative of the reward, so minimizing cost is equivalent to maximizing reward.
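To make the sign convention concrete, here is a minimal Python sketch. It is an illustration only and not part of GAIL itself: `toy_reward`, `cost_from_reward`, `total`, and the two small trajectories are hypothetical names and data made up for this example. It shows that the trajectory with the highest total reward is also the one with the lowest total cost.

```python
from typing import Callable, List, Tuple

State = Tuple[float, ...]
Action = int
Step = Tuple[State, Action]


def cost_from_reward(
    reward: Callable[[State, Action], float]
) -> Callable[[State, Action], float]:
    """Wrap a reward function so that cost(s, a) = -reward(s, a)."""
    def cost(state: State, action: Action) -> float:
        return -reward(state, action)
    return cost


def total(fn: Callable[[State, Action], float], trajectory: List[Step]) -> float:
    """Sum a per-step function over all (state, action) pairs."""
    return sum(fn(s, a) for s, a in trajectory)


def toy_reward(state: State, action: Action) -> float:
    # Hypothetical reward: 1.0 when the action matches a made-up "expert"
    # rule (action 1 if the first feature is non-negative, else action 0).
    expert_action = 1 if state[0] >= 0 else 0
    return 1.0 if action == expert_action else 0.0


toy_cost = cost_from_reward(toy_reward)

# Made-up trajectories: the first follows the expert rule at every step,
# the second follows it only at the last step.
expert_like = [((0.5,), 1), ((-0.2,), 0), ((1.0,), 1)]
non_expert = [((0.5,), 0), ((-0.2,), 1), ((1.0,), 1)]

for name, traj in [("expert_like", expert_like), ("non_expert", non_expert)]:
    print(name, "reward:", total(toy_reward, traj), "cost:", total(toy_cost, traj))
```

Because the two quantities differ only by sign, ranking trajectories by ascending cost gives the same order as ranking them by descending reward.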
Putting these constraints together, we