Unverified commit d6abd9ce authored by Léonard Blier, committed by GitHub

Merge pull request #29 from jilljenn/master

Update README.md
parents d349a580 4210af6e
......@@ -72,7 +72,7 @@ recurrent model, first-order information such as the velocity of the car is abse
We reproduced the paper "World Models" on the CarRacing environment, and made some additional experiments. Overall, our conclusions are twofold:
-* The results were easy to reproduce. It probably means that the method on this problem does not only achieve high perforance but is also very stable. This is an important remark for a deep reinforcement learning method.
+* The results were easy to reproduce. It probably means that the method on this problem does not only achieve high performance but is also very stable. This is an important remark for a deep reinforcement learning method.
* On the CarRacing-v0 environment, it seems that the recurrent network only serves as a recurrent reservoir, enabling access to crucial higher order information, such as velocity or acceleration. This observation needs some perspective, it comes with several interrogations and remarks:
* (Ha et al. 2018) reports good results when training in the simulated environment on the VizDoom task. Without a trained recurrent forward model, we cannot expect to obtain such performance.
......