Here is another part of our series “Reinforcement Learning in practice”: AI playing games.
Today we try another category of OpenAI Gym environments, the so-called “Box2D” family. This time it is “Lunar Lander”.
- Frozen Lake
- Lunar Lander
All the details explained HERE
The game looks like this:
We recommend reading the game description, as it contains important details. In short, we have to land on the pad between the two flags, ideally without using the main engine too much, since firing it costs points.
As in the previous example, we use DQN with two improvements: Double DQN and Dueling DQN. And, as always, our favourite PyTorch.
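To make the Double DQN part concrete, here is a minimal sketch of how the training targets can be computed in PyTorch. The function name `double_dqn_targets`, its arguments, and the discount `gamma=0.99` are illustrative assumptions, not necessarily the exact code behind the results in this post.

```python
import torch


def double_dqn_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Compute Double DQN targets for a batch of transitions.

    online_net / target_net: callables mapping a batch of states to Q-values
        of shape (batch, n_actions). Names are hypothetical.
    rewards, dones: float tensors of shape (batch,); dones is 1.0 for terminal steps.
    """
    with torch.no_grad():
        # The online network chooses the best next action...
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        # ...and the target network evaluates that chosen action.
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        # Terminal transitions do not bootstrap from the next state.
        return rewards + gamma * next_q * (1.0 - dones)
```

Splitting action selection (online network) from action evaluation (target network) is what distinguishes Double DQN from plain DQN, where the target network does both.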
Info: the X axis is the number of episodes, the Y axis is the score (total reward received from the environment per episode).
Again, as with CartPole, our agent solved the game quite quickly (reaching at least 200 points) and then kept its performance consistent (in this case with a few individual exceptions).
The average reward over the last 100 episodes was 212.72 (and it was above 200 after exactly 293 episodes). The overall average reward was 129.83, which includes the terrible (performance-wise) period at the beginning.
We stopped training at 500 episodes.
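For readers who want to track the same kind of statistic, here is a small sketch of one common way to check when a run counts as “solved”: the first episode at which the mean score over the last 100 episodes reaches 200. The helper name `first_solved_episode` and its defaults are assumptions for illustration; we are not claiming this is exactly how the numbers above were produced.

```python
def first_solved_episode(scores, window=100, threshold=200.0):
    """Return the 1-based episode at which the mean of the last `window`
    scores first reaches `threshold`, or None if it never does."""
    running_sum = 0.0
    for i, score in enumerate(scores):
        running_sum += score
        if i >= window:
            # Drop the score that just fell out of the window.
            running_sum -= scores[i - window]
        if i + 1 >= window and running_sum / window >= threshold:
            return i + 1
    return None


def overall_average(scores):
    """Mean score over every episode, including the early exploration phase."""
    return sum(scores) / len(scores)
```

Calling `first_solved_episode(scores)` on the list of per-episode scores collected during training gives the episode number, while `overall_average(scores)` gives the mean over the whole run.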
As a side note, we could clearly see the great performance of Dueling DQN here. One of its main benefits is that the agent learns NOT to use some actions (let’s say it learns to be patient), which is very important in this particular game (because, as mentioned, using the main engine costs points). The same algorithm with the same parameters but with Dueling disabled had significantly worse results.
Below is the agent’s performance (with Dueling DQN enabled) captured on video:
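To show what “Dueling” means in practice, here is a minimal PyTorch sketch of a dueling Q-network. The class name `DuelingQNetwork`, the single hidden layer, and the hidden size of 64 are illustrative assumptions, not the architecture used for the results above. The defaults of 8 state values and 4 actions match Lunar Lander’s discrete version.

```python
import torch
import torch.nn as nn


class DuelingQNetwork(nn.Module):
    """Q-network with separate state-value and advantage streams."""

    def __init__(self, state_dim=8, n_actions=4, hidden=64):
        super().__init__()
        # Shared feature extractor (layer sizes are assumptions).
        self.features = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        # V(s): how good the state is, regardless of the action taken.
        self.value = nn.Linear(hidden, 1)
        # A(s, a): how much better each action is than the average action.
        self.advantage = nn.Linear(hidden, n_actions)

    def forward(self, state):
        x = self.features(state)
        v = self.value(x)          # shape (batch, 1)
        a = self.advantage(x)      # shape (batch, n_actions)
        # Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
        return v + a - a.mean(dim=1, keepdim=True)
```

Because the value stream is learned separately, the network can judge a state as good or bad even when the choice of action barely matters, which fits the “patient” behaviour described above.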
Like it? Want to do similar things?
Have a look at our course! We teach everything from scratch!