For training purposes the tree is flattened into a single array.
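To make that fixed layout concrete, here is a toy sketch of how a depth-limited tree with a branching factor of 4 can be padded and flattened into an array of fixed length. This is not flatland's actual implementation; `num_nodes` and `flatten_tree` are hypothetical helpers, and `-inf` is used as a placeholder for unexplored branches:

```
import math

def num_nodes(depth, branches=4):
    # Number of nodes in a complete tree expanded to `depth` (root at depth 0).
    return sum(branches ** d for d in range(depth + 1))

def flatten_tree(node, depth, features_per_node, branches=4):
    # A node is a tuple (features, children); a missing subtree is None
    # and is padded with -inf so the flattened array has a fixed length.
    if node is None:
        flat = [-math.inf] * features_per_node
        children = [None] * branches
    else:
        flat = list(node[0])
        children = node[1]
    if depth > 0:
        for child in children:
            flat += flatten_tree(child, depth - 1, features_per_node, branches)
    return flat
```

With 2 features per node and depth 2, the array always has `2 * (1 + 4 + 16) = 42` entries, regardless of how much of the tree was actually observed.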
## Training

### Setting up the environment

Let us now train a simple double dueling DQN agent to navigate to its target on flatland. We start by importing the required flatland modules:
```
from flatland.envs.generators import complex_rail_generator
from flatland.envs.observations import TreeObsForRailEnv
from flatland.envs.rail_env import RailEnv
from flatland.utils.rendertools import RenderTool
from utils.observation_utils import norm_obs_clip, split_tree
```
For this simple example we want to train on randomly generated levels using the `complex_rail_generator`. We use the following parameters for our first experiment:
```
# Parameters for the Environment
x_dim = 10
...
n_agents = 1
n_goals = 5
min_dist = 5
```
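Putting these parameters together, the environment itself can then be constructed roughly as follows. This is a sketch assuming the flatland 0.x API: the exact keyword arguments of `complex_rail_generator` may differ in your version, and `y_dim` and `nr_extra` are assumed values not shown in the parameter listing above:

```
from flatland.envs.generators import complex_rail_generator
from flatland.envs.observations import TreeObsForRailEnv
from flatland.envs.rail_env import RailEnv

x_dim = 10       # width, from the parameters above
y_dim = 10       # assumed height, defined alongside x_dim in the full script
n_agents = 1
n_goals = 5
min_dist = 5

env = RailEnv(width=x_dim,
              height=y_dim,
              rail_generator=complex_rail_generator(nr_start_goal=n_goals,
                                                    nr_extra=5,   # assumed value
                                                    min_dist=min_dist,
                                                    seed=1),
              obs_builder_object=TreeObsForRailEnv(max_depth=2),
              number_of_agents=n_agents)
obs = env.reset()
```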
As mentioned above, for this experiment we are going to use the tree observation and thus we load the observation builder:
```
# We are training an Agent using the Tree Observation with depth 2
# The action space of flatland is 5 discrete actions
action_size = 5
```
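Since the agent is a DQN, actions over this 5-way discrete space are typically chosen epsilon-greedily from the network's Q-values: explore with probability `eps`, otherwise take the greedy action. A minimal sketch (the `select_action` helper is ours, not part of flatland):

```
import random

ACTION_SIZE = 5  # flatland's 5 discrete actions

def select_action(q_values, eps):
    # With probability eps pick a random action (exploration),
    # otherwise pick the index of the largest Q-value (exploitation).
    if random.random() < eps:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

During training, `eps` is usually annealed from 1.0 towards a small floor so the agent explores early and exploits later.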
In the `training_navigation.py` file you will find further variables that we initialize in order to keep track of the training progress.
Below you see an example code to train an agent. It is important to note that we reshape and normalize the tree observation provided by the environment to facilitate training.
To do so, we use the utility functions `split_tree(tree=np.array(obs[a]), num_features_per_node=features_per_node, current_depth=0)` and `norm_obs_clip()`. Feel free to modify the normalization as you see fit.
```
# Split the observation tree into its parts and normalize the observation using the utility functions.
# Build agent specific local observation
...
```
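For reference, a normalization in the spirit of `norm_obs_clip` could look like the following. This is one plausible implementation, assuming `-inf` entries mark unreachable branches of the tree; the actual version in `utils.observation_utils` may differ, and you are encouraged to adapt it:

```
import numpy as np

def norm_obs_clip(obs, clip_min=-1.0, clip_max=1.0):
    # Scale finite observation values by the largest positive finite entry,
    # map non-finite placeholder entries (-inf) to clip_min,
    # then clip everything into [clip_min, clip_max].
    obs = np.asarray(obs, dtype=float)
    finite = obs[np.isfinite(obs)]
    max_obs = finite.max() if finite.size and finite.max() > 0 else 1.0
    scaled = np.where(np.isfinite(obs), obs / max_obs, clip_min)
    return np.clip(scaled, clip_min, clip_max)
```

Keeping all inputs in a bounded range like this stabilizes DQN training, since unbounded distance values would otherwise dominate the network's inputs.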