The following script is the final result of the project for the Deep Learning lecture. It uses a trained VAE and MDN-RNN to generate a "dreamed" sequence of frames.
The length of the teaching sequence can be chosen by setting the argument `--teaching_duration`, and the length of the subsequent "dreamed" sequence by setting `--dream_duration`.
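An illustrative invocation is shown below; the script filename `generate_dream.py`, the `--logdir` flag, and the duration values are assumptions based on the conventions of the other scripts in this project, so adjust them to your setup:

```shell
# Condition on 100 real frames, then let the MDN-RNN "dream" 200 more
# (filename, --logdir flag, and values are illustrative assumptions)
python generate_dream.py --logdir exp_dir --teaching_duration 100 --dream_duration 200
```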
If this script is run on the Leonhard cluster, the following lines will load the correct modules and install the requirements before starting the `generate_dream` script:
Paper: Ha and Schmidhuber, "World Models", 2018. https://doi.org/10.5281/zenodo.1207631. For a quick summary of the paper and some additional experiments, visit the [GitHub page](https://ctallec.github.io/world-models/).
## Prerequisites
First, clone the project files from GitLab (you will need to enter your credentials):
Navigate to the project directory and execute the following command to build the Docker image. This might take a while, but you only need to do this once.
```bash
docker build -t deep-learning:worldmodels .
```
To run the container, run the following command in the project directory, depending on your OS:
```bash
# Windows (PowerShell)
docker run -it --rm -v ${pwd}:/app deep-learning:worldmodels

# Linux
docker run -it --rm -v $(pwd):/app deep-learning:worldmodels
```
## Running the worldmodels
To run the model, start the Docker container (see above) and execute the commands below inside it.
The model is composed of three parts:
1. A Variational Auto-Encoder (VAE), whose task is to compress the input images into a compact latent representation.
2. A Mixture-Density Recurrent Network (MDN-RNN), trained to predict the latent encoding of the next frame given past latent encodings and actions.
3. A linear Controller (C), which takes as input both the latent encoding of the current frame and the hidden state of the MDN-RNN (given past latents and actions), and outputs an action. It is trained to maximize the cumulative reward using the Covariance-Matrix Adaptation Evolution Strategy ([CMA-ES](http://www.cmap.polytechnique.fr/~nikolaus.hansen/cmaartic.pdf)) from the `cma` Python package.
In the given code, all three components are trained separately, using the scripts `trainvae.py`, `trainmdrnn.py` and `traincontroller.py`.
Training scripts take the following arguments:
* **--logdir**: The directory in which the models will be stored. If the specified logdir already exists, the old model is loaded and training continues from it.
* **--noreload**: Add this option to overwrite a model in *logdir* instead of reloading it.
### 1. Data generation
Before launching the VAE and MDN-RNN training scripts, you need to generate a dataset of random rollouts and place it in the `datasets/carracing` folder.
Data generation is handled through the `data/generation_script.py` script, e.g.
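A typical invocation might look like the following; the flag names are assumed from the project's generation script, and the rollout and thread counts are illustrative:

```shell
# Generate 1000 random rollouts into datasets/carracing using 8 worker threads
# (flag names and values are assumptions; check the script's --help output)
python data/generation_script.py --rollouts 1000 --threads 8 --rootdir datasets/carracing
```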
Rollouts are generated using a *Brownian* random policy, instead of the *white noise* `action_space.sample()` random policy from gym, which yields more consistent rollouts.
### 2. Training the VAE
The VAE is trained using the `trainvae.py` file, e.g.
```bash
python trainvae.py --logdir exp_dir
```
### 3. Training the MDN-RNN
The MDN-RNN is trained using the `trainmdrnn.py` file, e.g.
```bash
python trainmdrnn.py --logdir exp_dir
```
A VAE must have been trained in the same `exp_dir` for this script to work.
### 4. Training and testing the Controller
Finally, the controller is trained using CMA-ES, e.g.
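An illustrative invocation follows; the CMA-ES population size, number of evaluation samples, and target return shown here are assumptions, not prescribed values:

```shell
# Train the controller with CMA-ES in the same exp_dir as the VAE and MDN-RNN
# (flag names and values are assumptions; check the script's --help output)
python traincontroller.py --logdir exp_dir --n-samples 4 --pop-size 4 --target-return 950
```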