Another part of our series “Reinforcement Learning in practice”: AI playing games!
Today's experiment is almost the same as last time: we again learn directly from video output. This time we simply try a different game, Boxing!
- Frozen Lake
- Lunar Lander
All the details explained HERE
The game looks like this:
Our agent controls the boxer in white and, of course, needs to beat its opponent (in black). Whoever has more points at the end of a 2-minute round, or reaches 100 points first (a knockout), wins!
As mentioned, today's algorithm is pretty much the same as last time: we use PyTorch, Convolutional Neural Networks (CNNs), and DQN with two improvements, Double DQN and Dueling DQN.
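To make the two improvements concrete, here is a minimal PyTorch sketch. It is not our exact training code: the names `DuelingHead` and `double_dqn_target`, the feature size and the discount factor are assumptions chosen for illustration. The dueling head splits the network output into a state value and per-action advantages, and the Double DQN target lets the online network choose the next action while the target network evaluates it.

```python
import torch
import torch.nn as nn


class DuelingHead(nn.Module):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, a)."""

    def __init__(self, feature_dim: int, n_actions: int):
        super().__init__()
        # feature_dim is the size of the flattened CNN output (assumed here).
        self.value = nn.Linear(feature_dim, 1)
        self.advantage = nn.Linear(feature_dim, n_actions)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        v = self.value(features)       # shape (batch, 1)
        a = self.advantage(features)   # shape (batch, n_actions)
        # Subtracting the mean advantage keeps V and A identifiable.
        return v + a - a.mean(dim=1, keepdim=True)


def double_dqn_target(online_q_next, target_q_next, rewards, dones, gamma=0.99):
    """Double DQN target: online net selects the action, target net scores it."""
    best_actions = online_q_next.argmax(dim=1, keepdim=True)
    next_values = target_q_next.gather(1, best_actions).squeeze(1)
    # dones is 1.0 for terminal transitions, so no bootstrapping happens there.
    return rewards + gamma * (1.0 - dones) * next_values
```

In a full agent the head would sit on top of the convolutional layers, and the target would feed a loss such as Huber loss against the Q-value of the action actually taken.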
Info: the X axis is the number of episodes, the Y axis is the score (total reward received from the environment per episode).
This game definitely needs more training time than Pong, so we waited only until we got satisfactory results (green IT: we save power :) ). Even that took us much longer (over 6 hours, again on a single Nvidia GeForce GTX 1080 GPU). We stopped after around the 1100th episode.
As the rewards graph shows, the game is, let's say, not very difficult (at least at a basic level): at the beginning, when the agent was making mostly random moves, the score was positive (meaning the agent was winning rounds). Then, after the first learning period, when the agent started to rely on its “intelligence” more and more, a “small crisis” came and the results got definitely worse (dropping as low as -30 points around the 300th episode).
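The shift from mostly random moves to learned moves is typically driven by an epsilon-greedy schedule. The sketch below assumes a linear decay. The function names and the values of `eps_start`, `eps_end` and `decay_steps` are illustrative, not the settings used in this experiment.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=100_000):
    """Linearly anneal the exploration rate from eps_start to eps_end."""
    fraction = min(step / decay_steps, 1.0)
    return eps_start + fraction * (eps_end - eps_start)


def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Early on epsilon is close to 1, so nearly every move is random. As it decays, the agent relies more and more on its still imperfect Q-value estimates, which is one plausible explanation for a temporary dip like the one around the 300th episode.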
Finally, our agent learned how to use its “virtual hands”, and after that it kept making significant progress.
Eventually it was reaching slightly over 60 points, and the overall average was over 20 points (23.6), which, again according to DeepMind's paper, is much better than human level (4.3 points).
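Per-episode scores are noisy, so it helps to track a moving average next to the raw values. Below is a small sketch of one way to compute it. The helper name `moving_average` and the window size are assumptions, and any scores passed in are up to you (no data from our run is reproduced here).

```python
def moving_average(scores, window=100):
    """Average of the last `window` scores at every episode."""
    averages = []
    running_sum = 0.0
    for i, score in enumerate(scores):
        running_sum += score
        if i >= window:
            # Drop the score that just fell out of the window.
            running_sum -= scores[i - window]
        averages.append(running_sum / min(i + 1, window))
    return averages
```

Setting `window` to the total number of episodes gives the running overall average, whose last element is the mean over the whole training run.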
Below is the agent's performance captured on video:
Like it? Want to do similar things?
Have a look at our course! We teach everything from scratch!