Merge branch 'patch-2' into 'master'

Update intro_observationbuilder.rst See merge request flatland/flatland!145

Merge branch 'patch-2' into 'master'
Update intro_observationbuilder.rst See merge request flatland/flatland!145
462a7670 · Erik Nygren · f636be2d · 5fdd4f79 · 462a7670
Commit 462a7670 authored 5 years ago by Erik Nygren
--- a/docs/intro_observationbuilder.rst
+++ b/docs/intro_observationbuilder.rst
@@ -12,7 +12,7 @@ Whenever an environment needs to compute new observations for each agent, it que
 .. _Flatland-Challenge: https://www.aicrowd.com/challenges/flatland-challenge
 Example 1 : Simple (but useless) observation
--------------
+--------------------------------------------------------
 In this first example we implement all the functions necessary for the observation builder to be valid and work with **Flatland**.
 Custom observation builder objects need to derive from the `flatland.core.env_observation_builder.ObservationBuilder`_
 base class and must implement two methods, :code:`reset(self)` and :code:`get(self, handle)`.
@@ -54,7 +54,7 @@ In the next example we highlight how to derive from existing observation builder
 Example 2 : Single-agent navigation
--------------
+-------------------------------------
 Observation builder objects can of course derive from existing concrete subclasses of ObservationBuilder.
 For example, it may be useful to extend the TreeObsForRailEnv_ observation builder.
@@ -157,7 +157,7 @@ navigation to target, and shows the path taken as an animation.
 The code examples above appear in the example file `custom_observation_example.py <https://gitlab.aicrowd.com/flatland/flatland/blob/master/examples/custom_observation_example.py>`_. You can run it using :code:`python examples/custom_observation_example.py` from the root folder of the flatland repo.  The two examples are run one after the other.
 Example 3 : Using custom predictors and rendering observation
--------------
+-------------------------------------------------------------
 Because the re-scheduling task of the Flatland-Challenge_ requires some short time planning we allow the possibility to use custom predictors that help predict upcoming conflicts and help agent solve them in a timely manner.
 In the **Flatland Environment** we included an initial predictor ShortestPathPredictorForRailEnv_ to give you an idea what you can do with these predictors.
@@ -291,3 +291,73 @@ We can then use this new observation builder and the renderer to visualize the o
        print("Rewards: ", all_rewards, "  [done=", done, "]")
        env_renderer.render_env(show=True, frames=True, show_observations=True, show_predictions=False)
        time.sleep(0.5)
+How to access environment and agent data for observation builders
+------------------------------------------------------------------
+When building your custom observation builder, you might want to aggregate and define your own features that are different from the raw env data. In this section we introduce how such information can be accessed and how you can build your own features out of them.
+Transitions maps
+~~~~~~~~~~~~~~~~
+The transition maps build the base for all movement in the environment. They contain all the information about allowed transitions for the agent at any given position. Because railway movement is limited to the railway tracks, these are important features for any controller that want to interact with the environment. All functionality and features of a transition map can be found here_
+.. _here:https://gitlab.aicrowd.com/flatland/flatland/blob/master/flatland/core/transition_map.py
+**Get Transitions for cell**
+To access the possible transitions at any given cell there are different possibilites:
+1. You provide a cell position and a orientation in that cell (usually the orientation of the agent) and call :code:`cell_transitions = env.rail.get_transitions(*position, direction)` and in return you get a 4d vector with the transition probability ordered as :code:`[North, East, South, West]` given the initial orientation. The position is a tuple of the form :code:`(x, y)` where :code:`x in [0, height]` and :code:`y in [0, width]`. This can be used for branching in a tree search and when looking for all possible allowed paths of an agent as it will provide a simple way to get the possible trajectories.
+2. When more detailed information about the cell in general is necessary you can also get the full transitions of a cell by calling :code:`transition_int = env.rail.get_full_transitions(*position)`. This will return an :code:`int16` for the cell representing the allowed transitions. To understand the transitions returned it is best to represent it as a binary number :code:`bin(transition_int)`, where the bits have to following meaning: :code:`NN NE NS NW EN EE ES EW SN SE SS SW WN WE WS WW`. For example the binary code :code:`1000 0000 0010 0000`, represents a straigt where an agent facing north can transition north and an agent facing south can transition south and no other transitions are possible. To get a better feeling what the binary representations of the elements look like go to this Link_
+.._Link:https://gitlab.aicrowd.com/flatland/flatland/blob/master/flatland/core/grid/rail_env_grid.py#L29
+These two objects can be used for example to detect switches that are usable by other agents but not the observing agent itself. This can be an important feature when actions have to be taken in order to avoid conflicts.
+.. code-block:: python
+    cell_transitions = self.env.rail.get_transitions(*position, direction)
+    transition_bit = bin(self.env.rail.get_full_transitions(*position))
+    total_transitions = transition_bit.count("1")
+    num_transitions = np.count_nonzero(cell_transitions)
+    # Detect Switches that can only be used by other agents.
+    if total_transitions > 2 > num_transitions:
+        unusable_switch_detected = True
+Agent information
+~~~~~~~~~~~~~~~~~~
+The agents are represented as an agent class and are provided when the environment is instantiated. Because agents can have different properties it is helpful to know how to access this information.
+You can simply acces the three main types of agent information in the following ways with :code:`agent = env.agents[handle]`:
+**Agent basic information**
+All the agent in the initiated environment can be found in the :code:`env.agents` class. Given the index of the agent you have acces to:
+- Agent position :code:`agent.position` which returns the current coordinates :code:`(x, y)` of the agent.
+- Agent target :code:`agent.target`  which returns the target coordinates :code:`(x, y)`.
+- Agent direction :code:`agent.direction` which is an int representing the current orientation :code:`{0: North, 1: East, 2: South, 3: West}`
+- Agent moving :code:`agent.moving` where 0 means the agent is currently not moving and 1 indicates agent is moving.
+**Agent speed information**
+Beyond the basic agent information we can also access more details about the agents type by looking at speed data:
+- Agent max speed :code:`agent.speed_data["speed"]` wich defines the traveling speed when the agent is moving.
+- Agent position fraction :code:``agent.speed_data["position_fraction"]` which is a number between 0 and 1 and inidicates when the move to the next cell will occur. Each speed of an agent is 1 or a smaller fraction. At each :code:`env.step()` the agent moves at its fractional speed forwards any only changes to the next cell when the cumulated fractions are :code:`agent.speed_data["position_fraction"] >= 1.`
+**Agent malfunction information**
+Similar to the speed data you can also access individual data about the malfunctions of an agent. All data is available through :code:`agent.malfunction_data` with:
+- Indication how long the agent is still malfunctioning :code:`'malfunction'` by an integer counting down at each time step. 0 means the agent is ok and can move. 
+- Possion rate at which malfunctions happen for this agent :code:`'malfunction_rate'`
+- Number of steps untill next malfunction will occur :code:`'next_malfunction'`
+- Number of malfunctions an agent have occured for this agent so far :code:`nr_malfunctions'`