Commit da79652a authored by MasterScrat

Updated README
This starter kit contains 2 example policies to get started with this challenge:

- a simple single-agent DQN method
- a more robust multi-agent DQN method that you can submit out of the box to the challenge 🚀

**🔗 [Train the single-agent DQN policy](https://flatland.aicrowd.com/getting-started/rl/single-agent.html)**

**🔗 [Train the multi-agent DQN policy](https://flatland.aicrowd.com/getting-started/rl/multi-agent.html)**

**🔗 [Submit a trained multi-agent policy](https://flatland.aicrowd.com/getting-started/rl/first-submission.html)**
The single-agent example is meant as a minimal demonstration of how to use DQN. The multi-agent example is a better starting point for creating your own solution.
You can fully train the multi-agent policy in Colab for free! [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1GbPwZNQU7KJIJtilcGBTtpOAD3EabAzJ?usp=sharing)
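Before diving into the example scripts, it can help to see DQN's two core ingredients in isolation: epsilon-greedy action selection and the TD update toward the target `r + γ · max_a' Q(s', a')`. The sketch below is illustrative only and is not the starter kit's code: it uses a linear Q-function in place of the neural network, and the state size, action count, and hyper-parameters are made-up placeholders (the real examples derive the observation size from Flatland's tree observation builder and use Flatland's 5 actions).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and hyper-parameters for illustration only.
STATE_SIZE, N_ACTIONS = 8, 5
GAMMA, LR, EPS = 0.99, 1e-2, 0.1

# A linear Q-function stands in for the DQN's neural network here.
W = np.zeros((STATE_SIZE, N_ACTIONS))

def act(state, eps=EPS):
    """Epsilon-greedy action selection over Q(s, .) = state @ W."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(state @ W))

def learn(state, action, reward, next_state, done):
    """One TD(0) step toward the DQN target r + gamma * max_a' Q(s', a')."""
    target = reward if done else reward + GAMMA * np.max(next_state @ W)
    td_error = target - (state @ W)[action]
    # Gradient of the squared TD error w.r.t. the chosen action's weights.
    W[:, action] += LR * td_error * state

# Tiny smoke run on random transitions.
for _ in range(100):
    s, s2 = rng.random(STATE_SIZE), rng.random(STATE_SIZE)
    a = act(s)
    learn(s, a, 1.0, s2, done=False)
```

A full DQN adds a replay buffer and a periodically-synced target network on top of this loop; the multi-agent example runs the same kind of policy once per train.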
Sample training usage
---
Train the multi-agent policy for 150 episodes:
```bash
python reinforcement_learning/multi_agent_training.py -n 150
```
The multi-agent policy training can be tuned using command-line arguments:
```console
usage: multi_agent_training.py [-h] [-n N_EPISODES] [-t TRAINING_ENV_CONFIG]
                               [-e EVALUATION_ENV_CONFIG]
                               [--n_evaluation_episodes N_EVALUATION_EPISODES]
...
optional arguments:
  --render RENDER       render 1 episode in 100
```
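These flags are ordinary `argparse` options, so they can be mixed freely on the command line. The fragment below re-creates a plausible parser for just the four flags shown in the usage text above; it is a sketch, not the script's actual definition — the real script declares more options, and the types and defaults here are assumptions.

```python
import argparse

# Hypothetical re-creation of part of multi_agent_training.py's parser,
# covering only the flags visible in the usage text; types/defaults assumed.
parser = argparse.ArgumentParser(prog="multi_agent_training.py")
parser.add_argument("-n", "--n_episodes", type=int, default=2500,
                    help="number of training episodes")
parser.add_argument("-t", "--training_env_config", type=int, default=0,
                    help="training environment config id")
parser.add_argument("-e", "--evaluation_env_config", type=int, default=0,
                    help="evaluation environment config id")
parser.add_argument("--n_evaluation_episodes", type=int, default=25,
                    help="number of evaluation episodes")

# Equivalent to: python multi_agent_training.py -n 150 --n_evaluation_episodes 10
args = parser.parse_args(["-n", "150", "--n_evaluation_episodes", "10"])
print(args.n_episodes, args.n_evaluation_episodes)  # → 150 10
```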
[**📈 Performance with various hyper-parameters**](https://app.wandb.ai/masterscrat/flatland-examples-reinforcement_learning/reports/Flatland-Examples--VmlldzoxNDI2MTA)
[![](https://i.imgur.com/Lqrq5GE.png)](https://app.wandb.ai/masterscrat/flatland-examples-reinforcement_learning/reports/Flatland-Examples--VmlldzoxNDI2MTA)