cm3020 Topic 03: Reinforcement Learning (Game Playing)
Main Info
Title: Reinforcement Learning (Game Playing)
Teachers: Matthew Yee King
Semester Taken: October 2021
Parent Module: cm3020 Artificial Intelligence
Description
Digs into the DQN paper and related work in deep reinforcement learning and game playing.
Lecture Summaries
Can be found in cm3020 Lecture Summaries: Topic 03: Reinforcement Learning
Lab Summaries
The first lab, in week 13, gets you up and running with the OpenAI Gym, either locally or on their VM. You experiment with the gym, exploring the states, actions, and overall framework for running experiments.
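The state/action framework explored in this lab can be sketched without installing anything. The example below is a minimal sketch, not the lab's code: `CorridorEnv` is a hypothetical stand-in environment written for illustration, and it assumes the classic Gym convention where `reset()` returns a state and `step(action)` returns a `(state, reward, done, info)` tuple.

```python
import random


class CorridorEnv:
    """Hypothetical stand-in for a Gym environment (assumption, not part of Gym).

    The agent starts at position 0 and must reach the right end of a corridor.
    """

    n_actions = 2  # 0 = step left, 1 = step right

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        # Start a new episode and return the initial state.
        self.pos = 0
        return self.pos

    def step(self, action):
        # Apply the action, then report (state, reward, done, info).
        move = 1 if action == 1 else -1
        self.pos = max(0, self.pos + move)
        done = self.pos >= self.length - 1
        reward = 1.0 if done else 0.0
        return self.pos, reward, done, {}


def run_episode(env, policy, max_steps=200):
    """Run one episode: the policy picks actions, the environment returns new states."""
    state = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done, _info = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


if __name__ == "__main__":
    rng = random.Random(0)
    # A random policy ignores the state entirely.
    print(run_episode(CorridorEnv(), lambda s: rng.randrange(CorridorEnv.n_actions)))
```

Swapping `CorridorEnv` for a real gym environment should leave `run_episode` largely unchanged, provided the installed gym version uses the same `reset`/`step` return shapes.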
The second lab, also in week 13, walks through setting up Keras and then using a neural network that takes the state as input and outputs an action, continuous or discrete. By the end you have a loop going where the NN generates actions and the gym acts on them to produce a new state.
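To make the state-in, action-out idea concrete, here is a rough sketch of an untrained network that maps a state vector to either a discrete or a continuous action. It deliberately uses only the standard library rather than Keras, so `TinyPolicyNet`, its layer sizes, and the 4-number example state are all assumptions made for illustration, not the lab's model.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    # One row per output unit: n_in weights followed by a bias term.
    return [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layer, inputs, activation):
    outputs = []
    for row in layer:
        # zip stops at len(inputs), so the trailing bias is added separately.
        total = row[-1] + sum(w * x for w, x in zip(row, inputs))
        outputs.append(activation(total))
    return outputs


class TinyPolicyNet:
    """Hypothetical two-layer network with random, untrained weights."""

    def __init__(self, n_state, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.hidden = make_layer(n_state, n_hidden, rng)
        self.out = make_layer(n_hidden, n_out, rng)

    def scores(self, state):
        h = forward(self.hidden, state, math.tanh)
        return forward(self.out, h, lambda z: z)  # linear output layer

    def discrete_action(self, state):
        # Discrete case: pick the index of the highest-scoring output.
        s = self.scores(state)
        return max(range(len(s)), key=s.__getitem__)

    def continuous_action(self, state, low=-1.0, high=1.0):
        # Continuous case: squash each output into the range [low, high].
        return [low + (math.tanh(z) + 1.0) * 0.5 * (high - low)
                for z in self.scores(state)]


if __name__ == "__main__":
    net = TinyPolicyNet(n_state=4, n_hidden=8, n_out=2)
    state = [0.1, -0.2, 0.0, 0.3]
    print(net.discrete_action(state), net.continuous_action(state))
```

In the lab's loop, the chosen action would be passed to the gym's `step` call and the returned state fed back into the network on the next iteration.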
The final lab, in week 14, has you train the DQN agent and run pre-trained models in the gym.
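For orientation, the pieces a DQN training loop typically relies on can be sketched in a few lines: an experience replay buffer, epsilon-greedy action selection, and the one-step target r + gamma * max Q(s', a'). This is a simplified illustration, not the starter implementation used in the lab; the names `ReplayBuffer`, `epsilon_greedy`, and `td_target`, and the default `gamma`, are assumptions. The Q-values are passed in as plain lists where a real agent would call its network.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions; the oldest are dropped when full."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.items.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng):
        # Uniform random minibatch, which breaks up correlations between
        # consecutive steps of the same episode.
        return rng.sample(list(self.items), batch_size)

    def __len__(self):
        return len(self.items)


def epsilon_greedy(q_values, epsilon, rng):
    # Explore with probability epsilon, otherwise take the greedy action.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def td_target(reward, next_q_values, done, gamma=0.99):
    # Terminal transitions have no future value to bootstrap from.
    if done:
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    rng = random.Random(0)
    buffer = ReplayBuffer(capacity=3)
    for t in range(5):
        buffer.add(t, 0, 0.0, t + 1, False)
    print(len(buffer), buffer.sample(2, rng))
    print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1, rng=rng))
    print(td_target(1.0, [0.5, 2.0], done=False, gamma=0.9))
```

In the DQN paper, `next_q_values` comes from a separate, periodically updated target network, and the online network is trained to move its prediction for the taken action towards `td_target`.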
Assigned Reading
Week 11: History
Strachey: Logical or non-mathematical programmes (not accessible)
Mnih et al: Human-level control through deep reinforcement learning (The DQN paper)
Brown and Sandholm: Superhuman AI for heads-up no-limit poker
Raiman et al: Long-Term Planning and Situational Awareness in OpenAI Five
Badia et al: Agent57: Outperforming the Atari Human Benchmark
Not on the reading list but in the lectures:
Grace et al: When Will AI Exceed Human Performance? Evidence from AI Experts
Week 12: Formalism
Mnih et al: Human-level control through deep reinforcement learning (repeat)
Mnih et al: Playing Atari with deep reinforcement learning (original paper from 2013)
The Markov decision process toolbox has implementations of value function approximation techniques.
Week 13: Tooling
Week 15: State of the Art
Togelius: The Mario AI Championship (repeat)
Arulkumaran: AlphaStar: An Evolutionary Computation Perspective
Canaan: Leveling the Playing Field: Fairness in AI versus Human Benchmarks
Kapturowski: Recurrent Experience Replay in Distributed Reinforcement Learning
Badia et al: Agent57: Outperforming the Atari Human Benchmark (repeat)
Other Resources
The OpenAI Gym
The DQN implementation starter
The same implementation docs
AlphaGo - The Movie (recommended by MYK)