# 🚂 Flatland Examples
This repository contains simple examples to get started with the Flatland environment.
## Sequential Agent
The sequential agent is a very simple baseline that moves agents one after the other, taking each from its starting point to its target along the shortest possible path. This is very inefficient, but it solves all the instances generated by the `sparse_level_generator`. However, when scored in the AIcrowd challenge, this agent fails because of the time it needs to solve an episode.
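To make the idea concrete, here is a minimal sketch of such a sequential policy. This is not the repository's actual implementation: `next_shortest_path_action` is a hypothetical helper standing in for the shortest-path lookup the real agent performs, and Flatland API details may vary between versions.

```python
from flatland.envs.rail_env import RailEnv, RailEnvActions

def run_sequential_agent(env: RailEnv, next_shortest_path_action):
    """Move agents one at a time: each agent walks its shortest path
    to its target while all other agents stay put.

    `next_shortest_path_action(env, handle)` is a hypothetical helper
    that returns the next action along agent `handle`'s shortest path.
    """
    env.reset()
    n_agents = env.get_num_agents()
    for handle in range(n_agents):
        done = False
        while not done:
            # Every agent does nothing except the one currently being routed.
            actions = {h: RailEnvActions.DO_NOTHING for h in range(n_agents)}
            actions[handle] = next_shortest_path_action(env, handle)
            _, _, dones, _ = env.step(actions)
            done = dones[handle]
```

Because only one agent moves at a time, episodes take far longer than a coordinated solution would, which is exactly why this baseline times out in the challenge.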
## Reinforcement Learning
The reinforcement learning agents show how to use a simple DQN algorithm, implemented in PyTorch, to solve small Flatland problems.
You can train agents in Colab for free!
```bash
# Single agent environment, 150 episodes
python reinforcement_learning/single_agent_training.py -n 150

# Multi-agent environment, 150 episodes
python reinforcement_learning/multi_agent_training.py -n 150
```
The multi-agent example can be tuned using command-line arguments:
```
usage: multi_agent_training.py [-h] [-n N_EPISODES] [--eps_start EPS_START]
                               [--eps_end EPS_END] [--eps_decay EPS_DECAY]
                               [--buffer_size BUFFER_SIZE]
                               [--buffer_min_size BUFFER_MIN_SIZE]
                               [--batch_size BATCH_SIZE] [--gamma GAMMA]
                               [--tau TAU] [--learning_rate LEARNING_RATE]
                               [--hidden_size HIDDEN_SIZE]
                               [--update_every UPDATE_EVERY]
                               [--use_gpu USE_GPU] [--num_threads NUM_THREADS]
                               [--render RENDER]

optional arguments:
  -h, --help            show this help message and exit
  -n N_EPISODES, --n_episodes N_EPISODES
                        number of episodes to run
  --eps_start EPS_START
                        max exploration
  --eps_end EPS_END     min exploration
  --eps_decay EPS_DECAY
                        exploration decay
  --buffer_size BUFFER_SIZE
                        replay buffer size
  --buffer_min_size BUFFER_MIN_SIZE
                        min buffer size to start training
  --batch_size BATCH_SIZE
                        minibatch size
  --gamma GAMMA         discount factor
  --tau TAU             soft update of target parameters
  --learning_rate LEARNING_RATE
                        learning rate
  --hidden_size HIDDEN_SIZE
                        hidden size (2 fc layers)
  --update_every UPDATE_EVERY
                        how often to update the network
  --use_gpu USE_GPU     use GPU if available
  --num_threads NUM_THREADS
                        number of threads to use
  --render RENDER       render 1 episode in 100
```
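For example, to train for 1,000 episodes on GPU with a slower exploration decay (the values here are purely illustrative, not recommended settings):

```bash
python reinforcement_learning/multi_agent_training.py -n 1000 --eps_decay 0.995 --use_gpu True
```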
The single-agent example is meant as a minimal demonstration of how to use DQN. The multi-agent example is a better starting point for creating your own solution!
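As a rough illustration of what these agents learn, here is a minimal PyTorch Q-network with two fully connected hidden layers, matching the `--hidden_size` option above, together with the epsilon-greedy action selection that the `--eps_start`/`--eps_end`/`--eps_decay` schedule drives. This is a sketch with illustrative sizes, not the repository's exact policy, which additionally uses a replay buffer and soft target-network updates (the `--buffer_size` and `--tau` options above).

```python
import random

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Feed-forward Q-network: observation in, one Q-value per action out.

    Two fully connected hidden layers, mirroring the --hidden_size
    "(2 fc layers)" option; state/action/hidden sizes are illustrative.
    """

    def __init__(self, state_size: int, action_size: int, hidden_size: int = 128):
        super().__init__()
        self.action_size = action_size
        self.net = nn.Sequential(
            nn.Linear(state_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, action_size),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def act(q_network: QNetwork, state: torch.Tensor, eps: float) -> int:
    """Epsilon-greedy action selection: explore with probability eps
    (annealed from --eps_start towards --eps_end), otherwise exploit."""
    if random.random() < eps:
        return random.randrange(q_network.action_size)
    with torch.no_grad():
        return int(q_network(state).argmax())
```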
## 📈 Results using the multi-agent example with various hyper-parameters
You can find more details about these examples in the documentation.