Commit d34e70d2 authored by manuschn

Merge branch 'combined-obs-from-config' into cleanup

parents f067fa67 cc7efbc6
# Combined Observation
```{admonition} TL;DR
This observation allows combining multiple observations by specifying them in the run config.
```
### 💡 The idea
Provide a simple way to combine multiple observations.
### 🗂️ Files and usage
The observation is defined in "neurips2020-flatland-baselines/envs/flatland/observations/combined_obs.py".
To combine multiple observations, do not put the observation settings directly under "observation_config"; instead, use the names of the observations you want to combine as keys, each mapping to the corresponding observation config (see the example).
An example config is located in "neurips2020-flatland-baselines/baselines/global_density_obs/sparse_small_apex_maxdepth2_spmaxdepth30.yaml" and can be run with
`python ./train.py -f baselines/global_density_obs/sparse_small_apex_maxdepth2_spmaxdepth30.yaml`
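The key-based lookup can be sketched in plain Python. The registry and the two observation classes below are illustrative stand-ins (the real ones live in `envs/flatland/observations`); only the shape of `observation_config` mirrors the example YAML.

```python
# Hypothetical mini-registry standing in for envs/flatland/observations.
OBS_REGISTRY = {}

def register_obs(name):
    def wrap(cls):
        OBS_REGISTRY[name] = cls
        return cls
    return wrap

def make_obs(name, config):
    # Look up an observation class by name and construct it with its config.
    return OBS_REGISTRY[name](config)

@register_obs("tree")
class TreeObs:
    def __init__(self, config):
        self.max_depth = config["max_depth"]

@register_obs("localConflict")
class LocalConflictObs:
    def __init__(self, config):
        self.n_local = config["n_local"]

# Mirrors the combined observation section of the example config:
observation_config = {
    "tree": {"max_depth": 2, "shortest_path_max_depth": 30},
    "localConflict": {"max_depth": 2, "shortest_path_max_depth": 30, "n_local": 5},
}

# Each top-level key selects one sub-observation to combine.
sub_observations = [make_obs(name, observation_config[name])
                    for name in observation_config]
print([type(o).__name__ for o in sub_observations])  # ['TreeObs', 'LocalConflictObs']
```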
### 📦 Implementation Details
This observation does not generate any information for the agent itself; it simply concatenates the outputs of the specified observations.
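The concatenation can be sketched with a toy example: each sub-builder returns a dict of per-agent observations, and the combined builder collects them into per-agent lists. `StubBuilder` and `combined_get_many` are stand-ins for illustration, not the real classes.

```python
class StubBuilder:
    """Stand-in for an ObservationBuilder that labels its outputs."""
    def __init__(self, tag):
        self.tag = tag

    def get_many(self, handles):
        # One observation per agent handle.
        return {h: f"{self.tag}-obs-{h}" for h in handles}

def combined_get_many(builders, handles):
    # Naively collect each builder's output into a per-agent list,
    # mirroring what CombinedObsForRailEnv.get_many does.
    obs = {h: [] for h in handles}
    for b in builders:
        sub_obs = b.get_many(handles)
        for h in handles:
            obs[h].append(sub_obs[h])
    return obs

builders = [StubBuilder("tree"), StubBuilder("conflict")]
print(combined_get_many(builders, [0, 1]))
# {0: ['tree-obs-0', 'conflict-obs-0'], 1: ['tree-obs-1', 'conflict-obs-1']}
```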
### 📈 Results
Since this observation is meant as a helper for easily exploring combinations of observations, there is no meaningful baseline. However, as a sanity check, we did a run combining the tree and local conflict observations (see link below).
### 🔗 Links
* [Ape-X Paper – Distributed Prioritized Experience Replay (Horgan et al.)](https://arxiv.org/abs/1803.00933)
* [W&B report for test run](https://app.wandb.ai/masterscrat/flatland/reports/Tree-and-Conflict-Obs-|-sparse-small_v0--VmlldzoxNTc4MzU)
### 🌟 Credits
```yaml
flatland-sparse-small-density-cnn-apex:
    run: APEX
    env: flatland_sparse
    stop:
        timesteps_total: 15000000  # 1.5e7
    checkpoint_freq: 10
    checkpoint_at_end: True
    keep_checkpoints_num: 5
    checkpoint_score_attr: episode_reward_mean
    num_samples: 3
    config:
        num_workers: 13
        num_envs_per_worker: 5
        num_gpus: 0
        env_config:
            observation: combined
            observation_config:
                tree:
                    max_depth: 2
                    shortest_path_max_depth: 30
                localConflict:
                    max_depth: 2
                    shortest_path_max_depth: 30
                    n_local: 5
            generator: sparse_rail_generator
            generator_config: small_v0
            resolve_deadlocks: False
            deadlock_reward: 0
            density_reward_factor: 0
            wandb:
                project: flatland
                entity: masterscrat
                tags: ["small_v0", "tree_and_local_conflict", "apex"]  # TODO should be set programmatically
        model:
            fcnet_activation: relu
            fcnet_hiddens: [256, 256]
            vf_share_layers: True
        evaluation_num_workers: 2
        evaluation_interval: 100
        evaluation_num_episodes: 100
        evaluation_config:
            explore: False
            env_config:
                observation: density
                observation_config:
                    width: 25
                    height: 25
                    max_t: 1000
                    encoding: exp_decay
                regenerate_rail_on_reset: True
                regenerate_schedule_on_reset: True
                render: False
```
```python
import gym
from typing import Optional, List

from flatland.core.env_observation_builder import ObservationBuilder

from envs.flatland.observations import Observation, register_obs, make_obs


@register_obs("combined")
class CombinedObservation(Observation):
    def __init__(self, config) -> None:
        super().__init__(config)
        self._observations = [
            make_obs(obs_name, config[obs_name]) for obs_name in config.keys()
        ]
        self._builder = CombinedObsForRailEnv([
            o._builder for o in self._observations
        ])

    def builder(self) -> ObservationBuilder:
        return self._builder

    def observation_space(self) -> gym.Space:
        space = []
        for o in self._observations:
            space.append(o.observation_space())
        return gym.spaces.Tuple(space)


class CombinedObsForRailEnv(ObservationBuilder):
    def __init__(self, builders: List[ObservationBuilder]):
        super().__init__()
        self._builders = builders

    def reset(self):
        for b in self._builders:
            b.reset()

    def get(self, handle: int = 0):
        return None

    def get_many(self, handles: Optional[List[int]] = None):
        obs = {h: [] for h in handles}
        for b in self._builders:
            sub_obs = b.get_many(handles)
            for h in handles:
                obs[h].append(sub_obs[h])
        return obs

    def set_env(self, env):
        for b in self._builders:
            b.set_env(env)
```
```yaml
flatland-sparse-small-tree-and-conflict-fc-apex:
    run: APEX
    env: flatland_sparse
    stop:
        timesteps_total: 15000000  # 1.5e7
    checkpoint_freq: 10
    checkpoint_at_end: True
    keep_checkpoints_num: 5
    checkpoint_score_attr: episode_reward_mean
    config:
        num_workers: 15
        num_envs_per_worker: 5
        num_gpus: 0
        env_config:
            observation: combined
            observation_config:
                tree:
                    max_depth: 2
                    shortest_path_max_depth: 30
                localConflict:
                    max_depth: 2
                    shortest_path_max_depth: 30
                    n_local: 5
            generator: sparse_rail_generator
            generator_config: small_v0
            resolve_deadlocks: False
            deadlock_reward: 0
            density_reward_factor: 0
            wandb:
                project: flatland
                entity: masterscrat
                tags: ["small_v0", "tree_and_local_conflict", "apex"]  # TODO should be set programmatically
        model:
            fcnet_activation: relu
            fcnet_hiddens: [256, 256]
            vf_share_layers: True  # False
```
```yaml
flatland-sparse-small-density-cnn-apex:
    run: APEX
    env: flatland_sparse
    stop:
        timesteps_total: 10000
    checkpoint_freq: 10
    checkpoint_at_end: True
    keep_checkpoints_num: 5
    checkpoint_score_attr: episode_reward_mean
    config:
        num_workers: 2
        num_envs_per_worker: 5
        num_gpus: 0
        env_config:
            observation: combined
            observation_config:
                tree:
                    max_depth: 2
                    shortest_path_max_depth: 30
                localConflict:
                    max_depth: 2
                    shortest_path_max_depth: 30
                    n_local: 5
            generator: sparse_rail_generator
            generator_config: small_v0
            resolve_deadlocks: False
            deadlock_reward: 0
            density_reward_factor: 0
            wandb:
                project: flatland
                entity: masterscrat
                tags: ["small_v0", "tree_and_local_conflict", "apex"]  # TODO should be set programmatically
        model:
            fcnet_activation: relu
            fcnet_hiddens: [256, 256]
            vf_share_layers: True
```
```diff
@@ -15,3 +15,8 @@
 echo "===================="
 echo "GLOBAL DENSITY OBS"
 echo "===================="
 time python ./train.py -f experiments/tests/global_density_obs_apex.yaml
+echo "===================="
+echo "COMBINED OBS"
+echo "===================="
+time python ./train.py -f experiments/tests/combined_obs_apex.yaml
```