Merge branch 'master' into 188_refining_generator

# Conflicts: # examples/simple_example_city_railway_generator.py

Commit bd052de6 authored 5 years ago by Erik Nygren
--- a/docs/installation.rst
+++ b/docs/installation.rst
-.. highlight:: shell
-
-============
-Installation
-============
-
-Software Runtime & Dependencies
-------------------------------
-
-This is the recommended way of installation and running flatland's dependencies.
-
-* Install `Anaconda <https://www.anaconda.com/distribution/>`_ by following the instructions `here <https://www.anaconda.com/distribution/>`_
-* Create a new conda environment 
-
-.. code-block:: console
-
-    $ conda create python=3.6 --name flatland-rl
-    $ conda activate flatland-rl
-
-* Install the necessary dependencies
-
-.. code-block:: console
-
-    $ conda install -c conda-forge cairosvg pycairo
-    $ conda install -c anaconda tk  
-
-
-Stable release
--------------
-
-To install flatland, run this command in your terminal:
-
-.. code-block:: console
-
-    $ pip install flatland-rl
-
-This is the preferred method to install flatland, as it will always install the most recent stable release.
-
-If you don't have `pip`_ installed, this `Python installation guide`_ can guide
-you through the process.
-
-.. _pip: https://pip.pypa.io
-.. _Python installation guide: http://docs.python-guide.org/en/latest/starting/installation/
-
-
-From sources
------------
-
-The sources for flatland can be downloaded from the `Gitlab repo`_.
-
-You can clone the public repository:
-
-.. code-block:: console
-
-    $ git clone git@gitlab.aicrowd.com:flatland/flatland.git
-
-Once you have a copy of the source, you can install it with:
-
-.. code-block:: console
-
-    $ python setup.py install
-
-
-.. _Gitlab repo: https://gitlab.aicrowd.com/flatland/flatland
-
-
-Jupyter Canvas Widget
---------------------
-If you work with jupyter notebook you need to install the Jupyer Canvas Widget. To install the Jupyter Canvas Widget read also
-https://github.com/Who8MyLunch/Jupyter_Canvas_Widget#installation
--- a/docs/localevaluation.rst
+++ b/docs/localevaluation.rst
-================
-Local Evaluation
-================
-
-This document explains you how to locally evaluate your submissions before making 
-an official submission to the competition.
-
-Requirements
------------
-
-* **flatland-rl** : We expect that you have `flatland-rl` installed by following the instructions in  :doc:`installation`.
-
-* **redis** : Additionally you will also need to have  `redis installed <https://redis.io/topics/quickstart>`_ and **should have it running in the background.**
-
-Test Data
---------
-
-* **test env data** : You can `download and untar the test-env-data <https://www.aicrowd.com/challenges/flatland-challenge/dataset_files>`, at a location of your choice, lets say `/path/to/test-env-data/`. After untarring the folder, the folder structure should look something like:
-
-
-.. code-block:: console
-
-    .
-    └── test-env-data
-        ├── Test_0
-        │   ├── Level_0.pkl
-        │   └── Level_1.pkl
-        ├── Test_1
-        │   ├── Level_0.pkl
-        │   └── Level_1.pkl
-        ├..................
-        ├..................
-        ├── Test_8
-        │   ├── Level_0.pkl
-        │   └── Level_1.pkl
-        └── Test_9
-            ├── Level_0.pkl
-            └── Level_1.pkl
-
-Evaluation Service
------------------
-
-* **start evaluation service** : Then you can start the evaluator by running : 
-
-.. code-block:: console
-
-    flatland-evaluator --tests /path/to/test-env-data/
-
-RemoteClient
------------
-
-* **run client** : Some `sample submission code can be found in the starter-kit <https://github.com/AIcrowd/flatland-challenge-starter-kit/>`_, but before you can run your code locally using `FlatlandRemoteClient`, you will have to set the `AICROWD_TESTS_FOLDER` environment variable to the location where you previous untarred the folder with `the test-env-data`:
-
-
-.. code-block:: console
-
-    export AICROWD_TESTS_FOLDER="/path/to/test-env-data/"
-
-    # or on Windows :
-    # 
-    # set AICROWD_TESTS_FOLDER "\path\to\test-env-data\"
-
-    # and then finally run your code
-    python run.py
-
--- a/docs/FAQ.rst
+++ b/docs/FAQ.rst
--- a/docs/specifications/core.md
+++ b/docs/specifications/core.md
-# Core Specifications
-## Environment Class Overview
+## Core Specifications
+
+### Environment Class Overview

 The Environment class contains all necessary functions for the interactions between the agents and the environment. The base Environment class is derived from rllib.env.MultiAgentEnv (https://github.com/ray-project/ray).


--- a/docs/specifications/intro.md
+++ b/docs/specifications/intro.md
+## Intro
+
+In human-readable language, the specifications provide
+- a code base overview (hand-drawn concept)
+- key concepts (generators, envs) and how they are linked
+- links to the relevant code base
+
+![Overview](img/UML_flatland.png)
+[Diagram Source](https://confluence.sbb.ch/x/pQfsSw)
--- a/docs/intro_observation_actions.rst
+++ b/docs/intro_observation_actions.rst
-=============================
+
 Observation and Action Spaces
-=============================
+-----------------------------
 This is an introduction to the three standard observations and the action space of **Flatland**.

 Action Space
-============
+^^^^^^^^^^^^
 Flatland is a railway simulation. Thus the actions of an agent are strongly limited to the railway network. This means that in many cases not all actions are valid.
 The possible actions of an agent are

@@ -15,7 +15,7 @@ The possible actions of an agent are
 - ``4`` **Stop**: This action causes the agent to stop.

 Observation Spaces
-==================
+^^^^^^^^^^^^^^^^^^
 In the **Flatland** environment we have included three basic observations to get started. The figure below illustrates the observation range of the different basic observation: ``Global``, ``Local Grid`` and ``Local Tree``.

 .. image:: https://i.imgur.com/oo8EIYv.png
@@ -24,7 +24,7 @@ In the **Flatland** environment we have included three basic observations to get

   
 Global Observation
------------------
+~~~~~~~~~~~~~~~~~~
 Gives a global observation of the entire rail environment.

 The observation is composed of the following elements:
@@ -37,7 +37,7 @@ We encourage you to enhance this observation with any layer you think might help
 It would also be possible to construct a global observation for a super agent that controls all agents at once.

 Local Grid Observation
----------------------
+~~~~~~~~~~~~~~~~~~~~~~
 Gives a local observation of the rail environment around the agent.
 The observation is composed of the following elements:

@@ -50,7 +50,7 @@ Be aware that this observation **does not** contain any clues about target locat
 We encourage you to come up with creative ways to overcome this problem. In the tree observation below we introduce the concept of distance maps.

 Tree Observation
----------------
+~~~~~~~~~~~~~~~~
 The tree observation is built by exploiting the graph structure of the railway network. The observation is generated by spanning a **4 branched tree** from the current position of the agent. Each branch follows the allowed transitions (backward branch only allowed at dead-ends) until a cell with multiple allowed transitions is reached. Here the information gathered along the branch is stored as a node in the tree.
 The figure below illustrates how the tree observation is built:

@@ -73,7 +73,7 @@ The right side of the figure shows the resulting tree of the railway network on
    
    
 Node Information
----------------
+~~~~~~~~~~~~~~~~
 Each node is filled with information gathered along the path to the node. Currently each node contains 9 features:

 - 1: if own target lies on the explored branch the current distance from the agent in number of cells is stored.

--- a/docs/specifications/railway.md
+++ b/docs/specifications/railway.md
--- a/docs/specifications/rendering.md
+++ b/docs/specifications/rendering.md
-# Rendering Specifications
+## Rendering Specifications

-## Scope
+### Scope
 This doc specifies the software to meet the requirements in the Visualization requirements doc.

-## References
+### References
 - [Visualization Requirements](visualization)
 - [Core Spec](./core)

-## Interfaces
-### Interface with Environment Component
+### Interfaces
+#### Interface with Environment Component

 - Environment produces the Env Snapshot data structure (TBD)
 - Renderer reads the Env Snapshot
@@ -28,9 +28,9 @@ This doc specifies the software to meet the requirements in the Visualization re
    - Or, render frames without blocking environment
        - Render frames in separate process / thread

-#### Environment Snapshot
+##### Environment Snapshot

-### Data Structure
+#### Data Structure

 A definitions of the data structure is to be defined in Core requirements or Interfaces doc.

@@ -50,7 +50,7 @@ Top-level dictionary
        - Tree-based observation
            - TBD

-### Existing Tools / Libraries
+#### Existing Tools / Libraries
 1. Pygame
    1. Very easy to use. Like dead simple to add sprites etc. [Link](https://studywolf.wordpress.com/2015/03/06/arm-visualization-with pygame/)
    2. No inbuilt support for threads/processes. Does get faster if using pypy/pysco.
@@ -58,18 +58,18 @@ Top-level dictionary
    1. Somewhat simple, a little more verbose to use the different modules.
    2. Multi-threaded via QThread! Yay! (Doesn’t block main thread that does the real work), [Link](https://nikolak.com/pyqt-threading-tutorial/)

-#### How to structure the code
+##### How to structure the code

 1. Define draw functions/classes for each primitive
    1. Primitives: Agents (Trains), Railroad, Grass, Houses etc.
 2. Background. Initialize the background before starting the episode.
    1. Static objects in the scenes, directly draw those primitives once and cache.

-#### Proposed Interfaces
+##### Proposed Interfaces
 To-be-filled

-### Technical Graphics Considerations
+#### Technical Graphics Considerations

-#### Overlay dynamic primitives over the background at each time step.
+##### Overlay dynamic primitives over the background at each time step.

 No point trying to figure out changes. Need to explicitly draw every primitive anyways (that’s how these renders work).
--- a/docs/specifications/visualization.md
+++ b/docs/specifications/visualization.md
-# Visualization
+## Visualization

 ![logo](https://drive.google.com/uc?export=view&id=1rstqMPJXFJd9iD46z1A5Rus-W0Ww6O8i)


-# Introduction & Scope
+### Introduction & Scope

 Broad requirements for human-viewable display of a single Flatland Environment.


-## Context
+#### Context

 Shows this software component in relation to some of the other components.  We name the component the "Renderer".  Multiple agents interact with a single Environment.  A renderer interacts with the environment, and displays on screen, and/or into movie or image files.

@@ -20,10 +20,10 @@ Shows this software component in relation to some of the other components.  We n
 ![drawing](https://docs.google.com/a/google.com/drawings/d/12345/export/png)


-# Requirements
+### Requirements


-## Primary Requirements
+#### Primary Requirements



@@ -39,7 +39,7 @@ Shows this software component in relation to some of the other components.  We n
    7. Should not drive the "main loop" of Inference or training 


-## Secondary / Optional Requirements 
+#### Secondary / Optional Requirements 



@@ -68,7 +68,7 @@ Shows this software component in relation to some of the other components.  We n
    15. Browser


-## Performance Metrics
+#### Performance Metrics

 Here are some performance metrics which the Renderer should meet.

@@ -78,7 +78,7 @@ Here are some performance metrics which the Renderer should meet.
   <td>
   </td>
   <td><p style="text-align: right">
-# Per second</p>
+### Per second</p>

   </td>
   <td><p style="text-align: right">
@@ -144,15 +144,15 @@ Prototype time (ms)</p>



-## Example Visualization
+#### Example Visualization


-# Reference Documents
+### Reference Documents

 Link to this doc: https://docs.google.com/document/d/1Y4Mw0Q6r8PEOvuOZMbxQX-pV2QKDuwbZJBvn18mo9UU/edit#


-## Core Specification
+#### Core Specification

 This specifies the system containing the environment and agents - this will be able to run independently of the renderer.

@@ -161,24 +161,24 @@ This specifies the system containing the environment and agents - this will be a
 The data structure which the renderer needs to read initially resides here.


-## Visualization Specification
+#### Visualization Specification

 This will specify the software which will meet the requirements documented here.

 [https://docs.google.com/document/d/1XYOe_aUIpl1h_RdHnreACvevwNHAZWT0XHDL0HsfzRY/edit#](https://docs.google.com/document/d/1XYOe_aUIpl1h_RdHnreACvevwNHAZWT0XHDL0HsfzRY/edit#)


-## Interface Specification
+#### Interface Specification

 This will specify the interfaces through which the different components communicate


-# Non-requirements - to be deleted below here.
+### Non-requirements - to be deleted below here.

 The below has been copied into the spec doc.    Comments may be lost.  I'm only preserving it to save the comments for a few days - they don't cut & paste into the other doc!


-## Interface with Environment Component
+#### Interface with Environment Component



@@ -201,7 +201,7 @@ The below has been copied into the spec doc.    Comments may be lost.  I'm only
        *   Render frames in separate process / thread


-#### Environment Snapshot
+###### Environment Snapshot

 **Data Structure**

@@ -227,7 +227,7 @@ Top-level dictionary
            *   TBD


-## Investigation into Existing Tools / Libraries
+#### Investigation into Existing Tools / Libraries



@@ -252,9 +252,9 @@ Top-level dictionary
 To-be-filled


-## Technical Graphics Considerations
+#### Technical Graphics Considerations


-#### Overlay dynamic primitives over the background at each time step.
+###### Overlay dynamic primitives over the background at each time step.

 No point trying to figure out changes. Need to explicitly draw every primitive anyways (that's how these renders work).
--- a/docs/specifications_index.rst
+++ b/docs/specifications_index.rst
-Flatland Specs
-==============
-
-.. toctree::
-   :maxdepth: 2
-
-   specifications/specifications.md
-   specifications/core.md
-   specifications/railway.md
-   specifications/rendering.md
-   specifications/specifications.md
-   specifications/visualization.md
--- a/docs/gettingstarted.rst
+++ b/docs/gettingstarted.rst
-===============
-Getting Started
-===============
+Getting Started Tutorial
+========================

 Overview
 --------
@@ -16,9 +15,8 @@ To use flatland in a project:
    import flatland


-Part 1 : Basic Usage
--------------------
-
+Simple Example 1 : Basic Usage
+------------------------------
 The basic usage of RailEnv environments consists in creating a RailEnv object
 endowed with a rail generator, that generates new rail networks on each reset,
 and an observation generator object, that is supplied with environment-specific
@@ -120,7 +118,8 @@ The complete code for this part of the Getting Started guide can be found in


 Part 2 : Training a Simple an Agent on Flatland
-----------------------------------------------
+---------------------------------------------------------
+
 This is a brief tutorial on how to train an agent on Flatland.
 Here we use a simple random agent to illustrate the process on how to interact with the environment.
 The corresponding code can be found in examples/training_example.py and in the baselines repository
@@ -187,77 +186,4 @@ This dictionary is then passed to the environment which checks the validity of a
 The environment returns an array of new observations, reward dictionary for all agents as well as a flag for which agents are done.
 This information can be used to update the policy of your agent and if done['__all__'] == True the episode terminates.

-Part 3 : Customizing Observations and Level Generators
------------------------------------------------------
-
-Example code for generating custom observations given a RailEnv and to generate
-random rail maps are available in examples/custom_observation_example.py and
-examples/custom_railmap_example.py .
-
-Custom observations can be produced by deriving a new object from the
-core.env_observation_builder.ObservationBuilder base class, for example as follows:
-
-.. code-block:: python
-
-    class CustomObs(ObservationBuilder):
-        def __init__(self):
-            self.observation_space = [5]
-
-        def reset(self):
-            return
-
-        def get(self, handle):
-            observation = handle*np.ones((5,))
-            return observation
-
-It is important that an observation_space is defined with a list of dimensions
-of the returned observation tensors. get() returns the observation for each agent,
-of handle 'handle'.
-
-A RailEnv environment can then be created as usual:
-
-.. code-block:: python
-
-    env = RailEnv(width=7,
-                  height=7,
-                  rail_generator=random_rail_generator(),
-                  number_of_agents=3,
-                  obs_builder_object=CustomObs())
-
-As for generating custom rail maps, the RailEnv class accepts a rail_generator
-argument that must be a function with arguments `width`, `height`, `num_agents`,
-and `num_resets=0`, and that has to return a GridTransitionMap object (the rail map),
-and three lists of tuples containing the (row,column) coordinates of each of
-num_agent agents, their initial orientation **(0=North, 1=East, 2=South, 3=West)**,
-and the position of their targets.
-
-For example, the following custom rail map generator returns an empty map of
-size (height, width), with no agents (regardless of num_agents):
-
-.. code-block:: python
-
-    def custom_rail_generator():
-        def generator(width, height, num_agents=0, num_resets=0):
-            rail_trans = RailEnvTransitions()
-            grid_map = GridTransitionMap(width=width, height=height, transitions=rail_trans)
-            rail_array = grid_map.grid
-            rail_array.fill(0)
-
-            agents_positions = []
-            agents_direction = []
-            agents_target = []
-
-            return grid_map, agents_positions, agents_direction, agents_target
-        return generator
-
-It is worth to note that helpful utilities to manage RailEnv environments and their
-related data structures are available in 'envs.env_utils'. In particular,
-envs.env_utils.get_rnd_agents_pos_tgt_dir_on_rail is fairly handy to fill in
-random (but consistent) agents along with their targets and initial directions,
-given a rail map (GridTransitionMap object) and the desired number of agents:
-
-.. code-block:: python
-
-    agents_position, agents_direction, agents_target = get_rnd_agents_pos_tgt_dir_on_rail(
-        rail_map,
-        num_agents)
+The full source code of this example can be found in `examples/training_example.py <https://gitlab.aicrowd.com/flatland/flatland/blob/master/examples/training_example.py>`_.
--- a/docs/intro_observationbuilder.rst
+++ b/docs/intro_observationbuilder.rst
-==============================================================
-Getting Started with custom observations and custom predictors
-==============================================================
+Custom observations and custom predictors Tutorial
+==================================================

 Overview
 --------

--- a/docs/tutorials/03_rail_and_schedule_generator.md
+++ b/docs/tutorials/03_rail_and_schedule_generator.md
+# Level Generation Tutorial
+
+We are currently working on several new level generators. The levels used in submission testing will not all come from a single generator but from several different ones, to make sure that controllers can handle any railway-specific challenge.
+
+Let's have a look at the `sparse_rail_generator`.
+
+## Sparse Rail Generator
+![Example_Sparse](https://i.imgur.com/DP8sIyx.png)
+
+The idea behind the sparse rail generator is to mimic classic railway structures where dense nodes (cities) are sparsely connected to each other and where you have to manage traffic flow between the nodes efficiently. 
+The cities in this level generator are much simplified in comparison to real city networks but it mimics parts of the problems faced in daily operations of any railway company.
+
+There are a few parameters you can tune to build your own map and test different levels of complexity.
+**Warning**: some combinations of parameters do not go well together and will lead to infeasible level generation.
+In the worst case, the level generator cannot build the environment according to the parameters provided and currently only issues a warning.
+This will lead to a crash of the whole env.
+We are currently working on improvements here and are **happy to receive any suggestions from your side**.
+
+To build an environment you instantiate a `RailEnv` as follows:
+
+```python
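+# Imports assume the Flatland 2.0 module layout
+from flatland.envs.observations import TreeObsForRailEnv
+from flatland.envs.predictions import ShortestPathPredictorForRailEnv
+from flatland.envs.rail_env import RailEnv
+from flatland.envs.rail_generators import sparse_rail_generator
+from flatland.envs.schedule_generators import sparse_schedule_generator
+
+# Initialize the generator
+rail_generator = sparse_rail_generator(
+    num_cities=10,  # Number of cities in map
+    num_intersections=10,  # Number of intersections in map
+    num_trainstations=50,  # Number of possible start/targets on map
+    min_node_dist=6,  # Minimal distance of nodes
+    node_radius=3,  # Proximity of stations to city center
+    num_neighb=3,  # Number of connections to other cities
+    seed=5,  # Random seed
+    grid_mode=False  # Ordered distribution of nodes
+)
+
+# Build the environment
+env = RailEnv(
+    width=50,
+    height=50,
+    rail_generator=rail_generator,
+    schedule_generator=sparse_schedule_generator(),
+    number_of_agents=10,
+    obs_builder_object=TreeObsForRailEnv(max_depth=3, predictor=ShortestPathPredictorForRailEnv())
+)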
+```
+
+You can see that you now need both a `rail_generator` and a `schedule_generator` to generate a level. These need to work nicely together. The `rail_generator` will only generate the railway infrastructure and provide hints to the `schedule_generator` about where to place agents. The `schedule_generator` will then generate a schedule, meaning it places agents at different train stations and gives them tasks by providing individual targets.
+
+You can tune the following parameters in the `sparse_rail_generator`:
+
+- `num_cities` is the number of cities on a map. Cities are the only nodes that can host start and end points for agent tasks (train stations). Be careful that the number is not too high, as all the cities have to fit on the map. When `grid_mode=False` you also have to be careful when choosing `min_node_dist`, because level generation will fail if not all cities (and intersections) can be placed with at least `min_node_dist` between them.
+- `num_intersections` is the number of nodes that don't hold any train stations. They are also the first priority that a city connects to. We use these to allow for sparse connections between cities.
+- `num_trainstations` defines the *total* number of train stations in the network. This also sets the maximum number of agents allowed in the environment. This is a delicate parameter: there is only a limited amount of space available around nodes, so if the number is too high the level generation will fail. *Important*: Only the number of agents provided to the environment will actually produce active train stations. The others will just be present as dead-ends (see figures below).
+- `min_node_dist` is only used if `grid_mode=False` and represents the minimal distance between two nodes.
+- `node_radius` defines the extent of a city. Each train station is placed at a distance from the closest city node that is smaller than or equal to this number.
+- `num_neighb` defines the number of neighbouring nodes that connect to each other. This changes the connectivity and thus the number of alternative routes in the network.
+- `grid_mode`: `True` -> nodes evenly distributed in the env, `False` -> random distribution of nodes
+- `enhance_intersection`: `True` -> extra rail elements added at intersections
+- `seed` is used to initialize the random generator
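
Why some parameter combinations are infeasible can be seen in a small stand-alone sketch. It does not use Flatland: `place_nodes` is a hypothetical helper that places nodes at random while keeping a minimal distance between them (Euclidean distance and the `max_tries` limit are assumptions made for this illustration) and gives up when no free spot can be found.

```python
import math
import random


def place_nodes(width, height, num_nodes, min_node_dist, seed=0, max_tries=1000):
    """Randomly place nodes on a grid, keeping min_node_dist between them.

    Hypothetical helper for illustration only; not the Flatland implementation.
    Returns the list of placed (row, col) positions, which may be shorter
    than num_nodes if no free spot was found within max_tries attempts.
    """
    rng = random.Random(seed)
    nodes = []
    for _ in range(num_nodes):
        for _ in range(max_tries):
            candidate = (rng.randrange(height), rng.randrange(width))
            if all(math.dist(candidate, other) >= min_node_dist for other in nodes):
                nodes.append(candidate)
                break
        else:
            # No valid position found: the parameters are infeasible.
            break
    return nodes


# 20 nodes fit on a 50x50 map, but not on a 10x10 map with the same distance.
feasible = place_nodes(width=50, height=50, num_nodes=20, min_node_dist=6)
infeasible = place_nodes(width=10, height=10, num_nodes=20, min_node_dist=6)
print(len(feasible), len(infeasible))
```

On the small map the helper returns fewer nodes than requested, which corresponds to the situation where level generation fails.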
+
+
+If you run into any bugs with sets of parameters please let us know.
+
+Here is a network with `grid_mode=False` and the parameters from above.
+
+![sparse_random](https://i.imgur.com/Xg7nifF.png)
+
+and here with `grid_mode=True`
+
+![sparse_ordered](https://i.imgur.com/jyA7Pt4.png)
+
+## Example code
+
+To see all the changes in action you can just run the `flatland_2_0_example.py` file in the examples folder. The file can be found [here](https://gitlab.aicrowd.com/flatland/flatland/blob/master/examples/flatland_2_0_example.py).
--- a/docs/tutorials/04_stochasticity.md
+++ b/docs/tutorials/04_stochasticity.md
+# Stochasticity Tutorial
+
+Another area where we improved **Flat**land 2.0 is the addition of stochastic events during episodes.
+Such events are very common in railway networks, where the initial plan usually needs to be rescheduled during operations because minor events such as delayed departures from train stations, malfunctions of trains or infrastructure, or just the weather lead to delayed trains.
+
+We implemented a Poisson process to simulate delays by stopping agents at random times for random durations. The parameters necessary for the stochastic events can be provided when creating the environment.
+
+```python
+# Use the malfunction generator to break agents from time to time
+
+stochastic_data = {
+    'prop_malfunction': 0.5,  # Proportion of defective agents
+    'malfunction_rate': 30,  # Rate of malfunction occurrence
+    'min_duration': 3,  # Minimal duration of malfunction
+    'max_duration': 10  # Max duration of malfunction
+}
+}
+```
+
+The parameters are as follows:
+
+- `prop_malfunction` is the proportion of agents that can malfunction. `1.0` means that each agent can break.
+- `malfunction_rate` is the mean rate of the Poisson process in number of environment steps.
+- `min_duration` and `max_duration` set the range of malfunction durations. Durations are sampled uniformly from this range.
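
To make the roles of these parameters concrete, here is a minimal stand-alone sketch that samples malfunctions for a single agent. It is not Flatland's implementation: `sample_malfunctions` is a hypothetical helper, and it assumes that `malfunction_rate` is the mean number of environment steps between two malfunctions and that durations are integers.

```python
import random


def sample_malfunctions(num_steps, stochastic_data, seed=0):
    """Return a list of (start_step, duration) malfunctions for one agent.

    Illustration only (not Flatland's implementation). Assumes that
    'malfunction_rate' is the mean number of steps between malfunctions.
    """
    rng = random.Random(seed)
    if rng.random() >= stochastic_data['prop_malfunction']:
        return []  # this agent never breaks down
    events = []
    t = 0.0
    while True:
        # Exponential waiting times between events give a Poisson process.
        t += rng.expovariate(1.0 / stochastic_data['malfunction_rate'])
        start = int(t)
        if start >= num_steps:
            break
        # Durations are sampled uniformly between min and max (inclusive).
        duration = rng.randint(stochastic_data['min_duration'],
                               stochastic_data['max_duration'])
        events.append((start, duration))
        t = start + duration  # no new malfunction while the agent is broken
    return events


stochastic_data = {
    'prop_malfunction': 1.0,  # every agent can break in this example
    'malfunction_rate': 30,
    'min_duration': 3,
    'max_duration': 10,
}
events = sample_malfunctions(num_steps=1000, stochastic_data=stochastic_data)
print(events[:3])
```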
+
+You can introduce stochasticity by simply creating the env as follows:
+
+```python
+env = RailEnv(
+    ...
+    stochastic_data=stochastic_data,  # Malfunction data generator
+    ...    
+)
+```
+In your controller, you can check whether an agent is malfunctioning: 
+```python
+obs, rew, done, info = env.step(actions) 
+...
+action_dict = dict()
+for a in range(env.get_num_agents()):
+    if info['malfunction'][a] == 0:
+        action_dict.update({a: ...})
+
+# Custom observation builder
+tree_observation = TreeObsForRailEnv(max_depth=2, predictor=ShortestPathPredictorForRailEnv())
+
+# Different agent types (trains) with different speeds.
+speed_ratio_map = {1.: 0.25,  # Fast passenger train
+                   1. / 2.: 0.25,  # Fast freight train
+                   1. / 3.: 0.25,  # Slow commuter train
+                   1. / 4.: 0.25}  # Slow freight train
+
+env = RailEnv(width=50,
+              height=50,
+              rail_generator=sparse_rail_generator(num_cities=20,  # Number of cities in map (where train stations are)
+                                                   num_intersections=5,  # Number of intersections (no start / target)
+                                                   num_trainstations=15,  # Number of possible start/targets on map
+                                                   min_node_dist=3,  # Minimal distance of nodes
+                                                   node_radius=2,  # Proximity of stations to city center
+                                                   num_neighb=4,  # Number of connections to other cities/intersections
+                                                   seed=15,  # Random seed
+                                                   grid_mode=True,
+                                                   enhance_intersection=True
+                                                   ),
+              schedule_generator=sparse_schedule_generator(speed_ratio_map),
+              number_of_agents=10,
+              stochastic_data=stochastic_data,  # Malfunction data generator
+              obs_builder_object=tree_observation)
+```
+
+You will quickly realize that this leads to unforeseen difficulties, which means that **your controller** needs to observe the environment at all times to be able to react to the stochastic events.
+
+## Example code
+
+To see all the changes in action you can just run the `flatland_2_0_example.py` file in the examples folder. The file can be found [here](https://gitlab.aicrowd.com/flatland/flatland/blob/master/examples/flatland_2_0_example.py).
--- a/docs/specifications/specifications.md
+++ b/docs/specifications/specifications.md
-Flatland Specs
-==========================
+# Different speed profiles Tutorial

-What are **Flatland** specs about?
---------------------------------
-In a humand-readable language, they provide
-* code base overview (hand-drawn concept)
-* key concepts (generators, envs) and how are they linked
-* link relevant code base
+One of the main contributions to the complexity of railway network operations stems from the fact that trains travel at different speeds while sharing a very limited railway network.
+In **Flat**land 2.0 this feature will be enabled as well and will lead to much more complex configurations. Here we count on your support if you find bugs or have suggestions for improvements :).
+
+The different speed profiles can be generated using the `schedule_generator`, where you can choose as many different speeds as you like.
+Keep in mind that the *fastest speed* is 1 and all slower speeds must be between 0 and 1.
+For the submission scoring you can assume that there will be no more than 5 speed profiles.
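
As a rough illustration of a speed profile, the sketch below draws one speed per agent from a map of speed to share of agents. The map format mirrors the `speed_ratio_map` argument of `sparse_schedule_generator`, but `sample_speeds` itself is a hypothetical helper and the validation rules are assumptions, not Flatland code.

```python
import random


def sample_speeds(speed_ratio_map, num_agents, seed=0):
    """Draw one speed per agent according to the given proportions.

    Hypothetical helper for illustration; keys are speeds in (0, 1],
    values are the proportions of agents travelling at that speed.
    """
    if any(not 0.0 < speed <= 1.0 for speed in speed_ratio_map):
        raise ValueError("speeds must lie in (0, 1]")
    if abs(sum(speed_ratio_map.values()) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    rng = random.Random(seed)
    speeds = list(speed_ratio_map.keys())
    weights = list(speed_ratio_map.values())
    return rng.choices(speeds, weights=weights, k=num_agents)


# Four speed profiles, each assigned to a quarter of the agents on average.
speed_ratio_map = {1.0: 0.25, 1.0 / 2.0: 0.25, 1.0 / 3.0: 0.25, 1.0 / 4.0: 0.25}
agent_speeds = sample_speeds(speed_ratio_map, num_agents=10)
print(agent_speeds)
```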
+
+
+ 
+Later versions of **Flat**land might have varying speeds during episodes. Therefore, we return the agent speeds. 
+Note that we do not guarantee that the speed is computed at every step, but we will return it at each step if this is not costly.
+In your controller, you can get the agents' speed from the `info` returned by `step`: 
+```python
+obs, rew, done, info = env.step(actions) 
+...
+for a in range(env.get_num_agents()):
+    speed = info['speed'][a]
+```
+
+## Actions and observation with different speed levels
+
+Because the different speeds are implemented as fractions, the agents' ability to perform actions has been updated.
+We **do not allow actions to change within the cell**.
+This means that each agent can only choose an action to be taken when entering a cell.
+This action is then executed when a step to the next cell is valid. For example:

-## Overview
-![UML_flatland.png](img/UML_flatland.png)
-[Diagram Source](https://confluence.sbb.ch/x/pQfsSw)
+- Agent enters a switch and chooses to deviate left. The agent's fractional speed is 1/4, so the agent will take 4 time steps to complete its journey through the cell. On the 4th time step the agent will leave the cell, deviating left as chosen at the entry of the cell.
+    - All actions chosen by the agent during its travels within a cell are ignored.
+    - Agents can make observations at any time step. Make sure to discard observations without any information. See this [example](https://gitlab.aicrowd.com/flatland/baselines/blob/master/torch_training/training_navigation.py) for a simple implementation.
+- The environment checks whether the agent is allowed to move to the next cell only at the time of the switch to the next cell.
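
The example above can be sketched in a few lines of plain Python. This is only a toy model under stated assumptions (a per-cell position fraction that grows by `speed` each step; the function names are hypothetical), not the environment's actual code.

```python
def steps_until_exit(speed):
    """Count the time steps an agent with fractional speed spends in a cell.

    Illustration only: position_fraction accumulates `speed` per step and
    the agent leaves the cell once the fraction reaches 1.
    """
    position_fraction = 0.0
    steps = 0
    while position_fraction < 1.0 - 1e-9:  # tolerance for float rounding
        position_fraction += speed
        steps += 1
    return steps


def simulate_cell(speed, actions):
    """Return (action_executed, steps) for one cell.

    Only the action given on entry (the first one) is used; all later
    actions chosen while inside the cell are ignored.
    """
    entry_action = actions[0]
    return entry_action, steps_until_exit(speed)


# Agent with speed 1/4 chooses "left" on entry, then tries to change its mind.
action, steps = simulate_cell(0.25, ["left", "forward", "right", "stop"])
print(action, steps)  # left 4
```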

+In your controller, you can check whether an agent requires an action by checking `info`: 
+```python
+obs, rew, done, info = env.step(actions) 
+...
+action_dict = dict()
+for a in range(env.get_num_agents()):
+    if info['action_required'][a] and info['malfunction'][a] == 0:
+        action_dict.update({a: ...})

+```
+Notice that `info['action_required'][a]` does not mean that the action will have an effect: 
+if the next cell is blocked or the agent breaks down, the action cannot be performed and an action will be required again in the next step. 
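
As a rough illustration of the filtering described above, the sketch below builds an action dictionary from a hand-written stand-in for `info`, so it runs without an environment. The helper `select_actions`, the `policy` callable, and the action value `2` are assumptions made for this example, not part of the Flatland API.

```python
from typing import Callable, Dict


def select_actions(info: Dict[str, Dict[int, object]],
                   num_agents: int,
                   policy: Callable[[int], int]) -> Dict[int, int]:
    """Return actions only for agents that need one and are not malfunctioning."""
    action_dict = {}
    for a in range(num_agents):
        if info['action_required'][a] and info['malfunction'][a] == 0:
            action_dict[a] = policy(a)
    return action_dict


# Stand-in for the `info` returned by `env.step`:
# agent 1 is mid-cell, agent 2 is broken down for 3 more steps.
fake_info = {
    'action_required': {0: True, 1: False, 2: True},
    'malfunction': {0: 0, 1: 0, 2: 3},
}
actions = select_actions(fake_info, num_agents=3, policy=lambda a: 2)
print(actions)
```

Only agent 0 receives an action here; the other two are skipped until `info` reports that they need one and are healthy.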

 ## Rail Generators and Schedule Generators
 The separation between rail generator and schedule generator reflects the organisational separation in the railway domain
@@ -22,7 +52,7 @@ Usually, there is a third organisation, which ensures discrimination-free access
 However, in the **Flat**land challenge, we focus on the re-scheduling problem during live operations.

 Technically, 
-``` 
+```python
 RailGeneratorProduct = Tuple[GridTransitionMap, Optional[Any]]
 RailGenerator = Callable[[int, int, int, int], RailGeneratorProduct]

@@ -32,7 +62,7 @@ ScheduleGenerator = Callable[[GridTransitionMap, int, Optional[Any]], ScheduleGe
 ```

 We can then produce `RailGenerator`s by currying:
-```
+```python
 def sparse_rail_generator(num_cities=5, num_intersections=4, num_trainstations=2, min_node_dist=20, node_radius=2,
                          num_neighb=3, grid_mode=False, enhance_intersection=False, seed=0):

@@ -50,7 +80,7 @@ def sparse_rail_generator(num_cities=5, num_intersections=4, num_trainstations=2
    return generator
 ```
 And, similarly, `ScheduleGenerator`s:
-```
+```python
 def sparse_schedule_generator(speed_ratio_map: Mapping[float, float] = None) -> ScheduleGenerator:
    def generator(rail: GridTransitionMap, num_agents: int, hints: Any = None):
        # place agents:
@@ -69,7 +99,7 @@ For instance, the way the `sparse_rail_generator` generates the grid, it already
 Hence, `rail_generator` and `schedule_generator` have to match if `schedule_generator` presupposes some specific `agents_hints`.

 The environment's `reset` takes care of applying the two generators:
-```
+```python
    def __init__(self,
            ...
             rail_generator: RailGenerator = random_rail_generator(),
@@ -93,240 +123,6 @@ The environment's `reset` takes care of applying the two generators:
 ```
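
The currying pattern used by the generators can be sketched in isolation. The types below are simplified stand-ins (a list-of-lists grid and a two-argument callable) rather than the real `GridTransitionMap` and `RailGenerator` signatures, and `constant_grid_generator` is a hypothetical name.

```python
from typing import Any, Callable, List, Optional, Tuple

# Stand-ins for illustration only: the real signatures use GridTransitionMap.
Grid = List[List[int]]
GeneratorProduct = Tuple[Grid, Optional[Any]]
Generator = Callable[[int, int], GeneratorProduct]


def constant_grid_generator(fill_value: int = 0) -> Generator:
    """Fix the configuration now and return a callable to be invoked later."""

    def generator(width: int, height: int) -> GeneratorProduct:
        grid = [[fill_value] * width for _ in range(height)]
        hints = {'fill_value': fill_value}  # optional data for a matching second generator
        return grid, hints

    return generator


generator = constant_grid_generator(fill_value=7)  # configuration time
grid, hints = generator(3, 2)                      # call time, e.g. inside a reset
print(grid, hints)
```

The outer call captures the parameters, and the inner callable has the fixed signature the caller expects. The returned hints play the role of the optional second element that a matching schedule generator may rely on.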


-## RailEnv Speeds
-One of the main contributions to the complexity of railway network operations stems from the fact that all trains travel at different speeds while sharing a very limited railway network. 
-
-The different speed profiles can be generated using the `schedule_generator`, where you can choose as many different speeds as you like. 
-Keep in mind that the *fastest speed* is 1 and all slower speeds must be between 0 and 1. 
-For the submission scoring you can assume that there will be no more than 5 speed profiles.
-
-
-Currently (as of **Flat**land 2.0), an agent keeps its speed over the whole episode. 
-
-Because the different speeds are implemented as fractions, the agents' ability to perform actions has been updated. 
-We **do not allow actions to change within a cell**. 
-This means that each agent can only choose an action when entering a cell. 
-This action is then executed when a step to the next cell is valid. For example:
-
- An agent enters a switch and chooses to deviate left. The agent's fractional speed is 1/4, so it takes 4 time steps to complete its journey through the cell. On the 4th time step the agent leaves the cell, deviating left as chosen on entry.
-    - All actions chosen by the agent while it travels within a cell are ignored.
-    - Agents can make observations at any time step. Make sure to discard observations without any information. See this [example](https://gitlab.aicrowd.com/flatland/baselines/blob/master/torch_training/training_navigation.py) for a simple implementation.
- The environment checks whether the agent is allowed to move to the next cell only at the time of the switch to the next cell.
-
-In your controller, you can check whether an agent requires an action by checking `info`: 
-```
-obs, rew, done, info = env.step(actions) 
-...
-action_dict = dict()
-for a in range(env.get_num_agents()):
-    if info['action_required'][a]:
-        action_dict.update({a: ...})
-
-```
-Notice that `info['action_required'][a]` 
-* if the agent breaks down (see stochasticity below) on entering the cell (no distance elapsed in the cell), an action is required as long as the agent is broken down;
-when it gets back to work, the action chosen just before will be taken and executed at the end of the cell; you may check whether the agent
-gets healthy again in the next step by checking `info['malfunction'][a] == 1`.
-* when the agent has spent enough time in the cell, the next cell may not be free and the agent has to wait. 
-
+## Example code

-Later versions of **Flat**land might have varying speeds during episodes. 
-Therefore, we return the agents' speed; in your controller, you can get the agents' speed from the `info` returned by `step`: 
-```
-obs, rew, done, info = env.step(actions) 
-...
-for a in range(env.get_num_agents()):
-    speed = info['speed'][a]
-```
-Notice that we do not guarantee that the speed will be computed at each step, but if not costly we will return it at each step.
-
-
-
-
-
-
-
-
-
-## RailEnv Malfunctioning / Stochasticity
-
-Stochastic events may happen during the episodes. 
-This is very common for railway networks where the initial plan usually needs to be rescheduled during operations as minor events such as delayed departure from trainstations, malfunctions on trains or infrastructure or just the weather lead to delayed trains.
-
-We implemented a Poisson process to simulate delays by stopping agents at random times for random durations. The parameters necessary for the stochastic events can be provided when creating the environment.
-
-```
-# Use the malfunction generator to break agents from time to time
-
-stochastic_data = {
-    'prop_malfunction': 0.5,  # Proportion of defective agents
-    'malfunction_rate': 30,  # Rate of malfunction occurrence
-    'min_duration': 3,  # Minimal duration of malfunction
-    'max_duration': 10  # Max duration of malfunction
-}
-```
-
-The parameters are as follows:
-
- `prop_malfunction` is the proportion of agents that can malfunction. `1.0` means that each agent can break.
- `malfunction_rate` is the mean rate of the Poisson process in number of environment steps.
- `min_duration` and `max_duration` set the range of malfunction durations. They are sampled uniformly.
-
-You can introduce stochasticity by simply creating the env as follows:
-
-```
-env = RailEnv(
-    ...
-    stochastic_data=stochastic_data,  # Malfunction data generator
-    ...    
-)
-```
-In your controller, you can check whether an agent is malfunctioning: 
-```
-obs, rew, done, info = env.step(actions) 
-...
-action_dict = dict()
-for a in range(env.get_num_agents()):
-    if info['malfunction'][a] == 0:
-        action_dict.update({a: ...})
-
-# Custom observation builder
-tree_observation = TreeObsForRailEnv(max_depth=2, predictor=ShortestPathPredictorForRailEnv())
-
-# Different agent types (trains) with different speeds.
-speed_ration_map = {1.: 0.25,  # Fast passenger train
-                    1. / 2.: 0.25,  # Fast freight train
-                    1. / 3.: 0.25,  # Slow commuter train
-                    1. / 4.: 0.25}  # Slow freight train
-
-env = RailEnv(width=50,
-              height=50,
-              rail_generator=sparse_rail_generator(num_cities=20,  # Number of cities in map (where train stations are)
-                                                   num_intersections=5,  # Number of intersections (no start / target)
-                                                   num_trainstations=15,  # Number of possible start/targets on map
-                                                   min_node_dist=3,  # Minimal distance of nodes
-                                                   node_radius=2,  # Proximity of stations to city center
-                                                   num_neighb=4,  # Number of connections to other cities/intersections
-                                                   seed=15,  # Random seed
-                                                   grid_mode=True,
-                                                   enhance_intersection=True
-                                                   ),
-              schedule_generator=sparse_schedule_generator(speed_ration_map),
-              number_of_agents=10,
-              stochastic_data=stochastic_data,  # Malfunction data generator
-              obs_builder_object=tree_observation)
-```
-
-
-## Observation Builders
-Every `RailEnv` has an `obs_builder`. The `obs_builder` has full access to the `RailEnv`. 
-The `obs_builder` is called in the `step()` function to produce the observations.
-
-```
-env = RailEnv(
-    ...
-    obs_builder_object=TreeObsForRailEnv(
-        max_depth=2,
-       predictor=ShortestPathPredictorForRailEnv(max_depth=10)
-    ),
-    ...                   
-)
-```
-
-The two principal observation builders provided are global and tree.
-
-### Global Observation Builder
-`GlobalObsForRailEnv` gives a global observation of the entire rail environment.
-* transition map array with dimensions (env.height, env.width, 16),
-          assuming 16 bits encoding of transitions.
-
-* Two 2D arrays (map_height, map_width, 2) containing respectively the position of the given agent
-         target and the positions of the other agents targets.
-
-* A 3D array (map_height, map_width, 4) with
-            - first channel containing the agent's position and direction
-            - second channel containing the other agents' positions and directions
-            - third channel containing agent malfunctions
-            - fourth channel containing agent fractional speeds
-            
-### Tree Observation Builder
-`TreeObsForRailEnv` computes the current observation for each agent.
-
-The observation vector is composed of 4 sequential parts, corresponding to data from the up to 4 possible
-movements in a `RailEnv` (up to because only a subset of possible transitions are allowed in RailEnv).
-The possible movements are sorted relative to the current orientation of the agent, rather than NESW as for
-the transitions. The order is:
-
-    [data from 'left'] + [data from 'forward'] + [data from 'right'] + [data from 'back']
-
-Each branch data is organized as:
-
-    [root node information] +
-    [recursive branch data from 'left'] +
-    [... from 'forward'] +
-    [... from 'right] +
-    [... from 'back']
-
-Each node information is composed of 11 features:
-
-1. if own target lies on the explored branch the current distance from the agent in number of cells is stored.
-
-2. if another agents target is detected the distance in number of cells from the agents current location
-    is stored
-
-3. if another agent is detected the distance in number of cells from current agent position is stored.
-
-4. possible conflict detected
-    tot_dist = Other agent predicts to pass along this cell at the same time as the agent, we store the
-     distance in number of cells from current agent position
-
-    0 = No other agent reserve the same cell at similar time
-
-5. if a switch that the agent cannot use is detected, we store the distance.
-
-6. This feature stores the distance in number of cells to the next branching  (current node)
-
-7. minimum distance from node to the agent's target given the direction of the agent if this path is chosen
-
-8. agent in the same direction
-    n = number of agents present same direction
-        (possible future use: number of other agents in the same direction in this branch)
-    0 = no agent present same direction
-
-9. agent in the opposite direction
-    n = number of agents present other direction than myself (so conflict)
-        (possible future use: number of other agents in other direction in this branch, ie. number of conflicts)
-    0 = no agent present other direction than myself
-
-10. malfunctioning/blocking agents
-    n = number of time steps the observed agent remains blocked
-
-11. slowest observed speed of an agent in same direction
-    1 if no agent is observed
-
-    min_fractional speed otherwise
-
-Missing/padding nodes are filled in with -inf (truncated).
-Missing values in present node are filled in with +inf (truncated).
-
-
-In case of the root node, the values are [0, 0, 0, 0, distance from agent to target, own malfunction, own speed]
-In case the target node is reached, the values are [0, 0, 0, 0, 0].
-
-
-## Predictors
-Predictors make predictions on future agents' moves based on the current state of the environment.
-They are decoupled from observation builders in order to encapsulate the functionality and to make it re-usable.
-
-For instance, `TreeObsForRailEnv` optionally uses the predicted trajectories while exploring
-the branches of an agent's future moves to detect future conflicts.
-
-The general call structure is as follows:
-```
-RailEnv.step() 
-               -> ObservationBuilder.get_many() 
-                                                ->  self.predictor.get()
-                                                    self.get()
-                                                    self.get()
-                                                    ...
-```
+To see all the changes in action, run the `flatland_2_0_example.py` file in the examples folder. The file can be found [here](https://gitlab.aicrowd.com/flatland/flatland/blob/master/examples/flatland_2_0_example.py).
--- a/env_data/tests/test_001.pkl
+++ b/env_data/tests/test_001.pkl
--- a/env_data/tests/test_002.pkl
+++ b/env_data/tests/test_002.pkl
--- a/examples/custom_observation_example_01_SimpleObs.py
+++ b/examples/custom_observation_example_01_SimpleObs.py
@@ -23,7 +23,7 @@ class SimpleObs(ObservationBuilder):
    def reset(self):
        return

-    def get(self, handle: int = 0):
+    def get(self, handle: int = 0) -> np.ndarray:
        observation = handle * np.ones((5,))
        return observation


--- a/examples/custom_observation_example_02_SingleAgentNavigationObs.py
+++ b/examples/custom_observation_example_02_SingleAgentNavigationObs.py
@@ -2,11 +2,12 @@ import getopt
 import random
 import sys
 import time
+from typing import List

 import numpy as np

+from flatland.core.env_observation_builder import ObservationBuilder
 from flatland.core.grid.grid4_utils import get_new_position
-from flatland.envs.observations import TreeObsForRailEnv
 from flatland.envs.rail_env import RailEnv
 from flatland.envs.rail_generators import complex_rail_generator
 from flatland.envs.schedule_generators import complex_schedule_generator
@@ -17,26 +18,22 @@ random.seed(100)
 np.random.seed(100)


-class SingleAgentNavigationObs(TreeObsForRailEnv):
+class SingleAgentNavigationObs(ObservationBuilder):
    """
-    We derive our observation builder from TreeObsForRailEnv, to exploit the existing implementation to compute
-    the minimum distances from each grid node to each agent's target.
-
-    We then build a representation vector with 3 binary components, indicating which of the 3 available directions
+    We build a representation vector with 3 binary components, indicating which of the 3 available directions
    for each agent (Left, Forward, Right) lead to the shortest path to its target.
    E.g., if taking the Left branch (if available) is the shortest route to the agent's target, the observation vector
    will be [1, 0, 0].
    """

    def __init__(self):
-        super().__init__(max_depth=0)
+        super().__init__()
        self.observation_space = [3]

    def reset(self):
-        # Recompute the distance map, if the environment has changed.
-        super().reset()
+        pass

-    def get(self, handle: int = 0):
+    def get(self, handle: int = 0) -> List[int]:
        agent = self.env.agents[handle]

        possible_transitions = self.env.rail.get_transitions(*agent.position, agent.direction)

--- a/examples/custom_observation_example_03_ObservePredictions.py
+++ b/examples/custom_observation_example_03_ObservePredictions.py
@@ -2,12 +2,13 @@ import getopt
 import random
 import sys
 import time
-from typing import Optional, List
+from typing import Optional, List, Dict

 import numpy as np

+from flatland.core.env import Environment
+from flatland.core.env_observation_builder import ObservationBuilder
 from flatland.core.grid.grid_utils import coordinate_to_position
-from flatland.envs.observations import TreeObsForRailEnv
 from flatland.envs.predictions import ShortestPathPredictorForRailEnv
 from flatland.envs.rail_env import RailEnv
 from flatland.envs.rail_generators import complex_rail_generator
@@ -20,28 +21,20 @@ random.seed(100)
 np.random.seed(100)


-class ObservePredictions(TreeObsForRailEnv):
+class ObservePredictions(ObservationBuilder):
    """
    We use the provided ShortestPathPredictor to illustrate the usage of predictors in your custom observation.
-
-    We derive our observation builder from TreeObsForRailEnv, to exploit the existing implementation to compute
-    the minimum distances from each grid node to each agent's target.
-
-    This is necessary so that we can pass the distance map to the ShortestPathPredictor
-
-    Here we also want to highlight how you can visualize your observation
    """

    def __init__(self, predictor):
-        super().__init__(max_depth=0)
+        super().__init__()
        self.observation_space = [10]
        self.predictor = predictor

    def reset(self):
-        # Recompute the distance map, if the environment has changed.
-        super().reset()
+        pass

-    def get_many(self, handles: Optional[List[int]] = None):
+    def get_many(self, handles: Optional[List[int]] = None) -> Dict[int, np.ndarray]:
        '''
        Because we do not want to call the predictor separately for every agent, we implement the get_many function.
        Here we can call the predictor just once for all the agents and use the predictions to generate our observations
@@ -69,7 +62,7 @@ class ObservePredictions(TreeObsForRailEnv):
            observations[h] = self.get(h)
        return observations

-    def get(self, handle: int = 0):
+    def get(self, handle: int = 0) -> np.ndarray:
        '''
        Let's write a simple observation which just indicates whether or not the agent's own predicted path
        overlaps with other predicted paths at any time. This is useless for the task of navigation but might
@@ -106,6 +99,11 @@ class ObservePredictions(TreeObsForRailEnv):

        return observation

+    def set_env(self, env: Environment):
+        super().set_env(env)
+        if self.predictor:
+            self.predictor.set_env(self.env)
+

 def main(args):
    try: