flatland.baselines package
==========================

Submodules
----------

flatland.baselines.dueling\_double\_dqn module
-----------------------------------------------

.. automodule:: flatland.baselines.dueling_double_dqn
   :members:
   :undoc-members:
   :show-inheritance:

flatland.baselines.model module
-------------------------------

.. automodule:: flatland.baselines.model
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: flatland.baselines
   :members:
   :undoc-members:
   :show-inheritance:
flatland.core.grid package
==========================

Submodules
----------

flatland.core.grid.grid4 module
-------------------------------

.. automodule:: flatland.core.grid.grid4
   :members:
   :undoc-members:
   :show-inheritance:

flatland.core.grid.grid4\_astar module
--------------------------------------

.. automodule:: flatland.core.grid.grid4_astar
   :members:
   :undoc-members:
   :show-inheritance:

flatland.core.grid.grid4\_utils module
--------------------------------------

.. automodule:: flatland.core.grid.grid4_utils
   :members:
   :undoc-members:
   :show-inheritance:

flatland.core.grid.grid8 module
-------------------------------

.. automodule:: flatland.core.grid.grid8
   :members:
   :undoc-members:
   :show-inheritance:

flatland.core.grid.grid\_utils module
-------------------------------------

.. automodule:: flatland.core.grid.grid_utils
   :members:
   :undoc-members:
   :show-inheritance:

flatland.core.grid.rail\_env\_grid module
-----------------------------------------

.. automodule:: flatland.core.grid.rail_env_grid
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: flatland.core.grid
   :members:
   :undoc-members:
   :show-inheritance:
flatland.core package
=====================

Submodules
----------

flatland.core.env module
------------------------

.. automodule:: flatland.core.env
   :members:
   :undoc-members:
   :show-inheritance:

flatland.core.transitions module
--------------------------------

.. automodule:: flatland.core.transitions
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: flatland.core
   :members:
   :undoc-members:
   :show-inheritance:
flatland.envs package
=====================

Submodules
----------

flatland.envs.rail\_env module
------------------------------

.. automodule:: flatland.envs.rail_env
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: flatland.envs
   :members:
   :undoc-members:
   :show-inheritance:
flatland package
================

Subpackages
-----------

.. toctree::

   flatland.core
   flatland.envs
   flatland.utils

Submodules
----------

flatland.cli module
-------------------

.. automodule:: flatland.cli
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: flatland
   :members:
   :undoc-members:
   :show-inheritance:
flatland.utils package
======================

Module contents
---------------

.. automodule:: flatland.utils
   :members:
   :undoc-members:
   :show-inheritance:
Welcome to flatland's documentation!
====================================

.. include:: ../README.rst

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   01_readme
   03_tutorials_toc
   04_specifications_toc
   05_apidoc
   06_contributing
   07_changes
   08_authors
   09_faq_toc
   10_interface

Indices and tables
==================
.. highlight:: shell

============
Installation
============

Software Runtime & Dependencies
-------------------------------

This is the recommended way of installing and running flatland's dependencies.

* Install `Anaconda <https://www.anaconda.com/distribution/>`_ by following the instructions `here <https://www.anaconda.com/distribution/>`_
* Create a new conda environment:

  .. code-block:: console

      $ conda create python=3.6 --name flatland-rl
      $ conda activate flatland-rl

* Install the necessary dependencies:

  .. code-block:: console

      $ conda install -c conda-forge cairosvg pycairo
      $ conda install -c anaconda tk

Stable release
--------------

To install flatland, run this command in your terminal:

.. code-block:: console

    $ pip install flatland-rl

This is the preferred method to install flatland, as it will always install the most recent stable release.

If you don't have `pip`_ installed, this `Python installation guide`_ can guide
you through the process.

.. _pip: https://pip.pypa.io
.. _Python installation guide: http://docs.python-guide.org/en/latest/starting/installation/

From sources
------------

The sources for flatland can be downloaded from the `Gitlab repo`_.

You can clone the public repository:

.. code-block:: console

    $ git clone git@gitlab.aicrowd.com:flatland/flatland.git

Once you have a copy of the source, you can install it with:

.. code-block:: console

    $ python setup.py install

.. _Gitlab repo: https://gitlab.aicrowd.com/flatland/flatland

Jupyter Canvas Widget
---------------------

If you work with Jupyter notebooks, you also need to install the Jupyter Canvas Widget. Installation instructions are available at
https://github.com/Who8MyLunch/Jupyter_Canvas_Widget#installation
# PettingZoo

> PettingZoo (https://www.pettingzoo.ml/) is a collection of multi-agent environments for reinforcement learning. We built a PettingZoo interface for flatland.

## Background

PettingZoo is a popular multi-agent environment library (https://arxiv.org/abs/2009.14471) that aims to be the gym standard for Multi-Agent Reinforcement Learning. We list below the advantages that make it suitable for use with flatland:

- Works with both rllib (https://docs.ray.io/en/latest/rllib.html) and stable baselines 3 (https://stable-baselines3.readthedocs.io/) using wrappers from SuperSuit.
- Clean API (https://www.pettingzoo.ml/api) with additional facilities/APIs for parallel execution, saving observations, recording with the gym monitor, and processing/normalising observations.
- Scikit-learn-inspired API, e.g.

  ```python
  act = model.predict(obs, deterministic=True)[0]
  ```

- Parallel learning with stable baselines 3 using just two lines of code:

  ```python
  env = ss.pettingzoo_env_to_vec_env_v0(env)
  env = ss.concat_vec_envs_v0(env, 8, num_cpus=4, base_class="stable_baselines3")
  ```

- Tested on various multi-agent environments with agent counts comparable to flatland, e.g. https://www.pettingzoo.ml/magent
- The clean interface means we can add an experiment-tracking tool like wandb and have full flexibility in what information we save.
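
Putting the pieces together, here is a hedged end-to-end sketch: `make_flatland_pettingzoo_env()` is a placeholder for however your flatland version constructs its PettingZoo parallel environment (the real constructor name and import path may differ), while the SuperSuit and stable-baselines3 calls are the ones quoted above.

```python
import supersuit as ss
from stable_baselines3 import PPO

# Placeholder: construct the Flatland PettingZoo parallel env here.
# The actual constructor/import path depends on your flatland version.
env = make_flatland_pettingzoo_env()

# The two SuperSuit lines from above: vectorize the multi-agent env,
# then stack 8 copies for parallel rollout collection on 4 CPUs.
env = ss.pettingzoo_env_to_vec_env_v0(env)
env = ss.concat_vec_envs_v0(env, 8, num_cpus=4, base_class="stable_baselines3")

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)

# Scikit-learn-style prediction, as in the snippet above:
# act = model.predict(obs, deterministic=True)[0]
```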
PettingZoo
==========

..

   PettingZoo (https://www.pettingzoo.ml/) is a collection of multi-agent environments for reinforcement learning. We built a PettingZoo interface for flatland.

Background
----------

PettingZoo is a popular multi-agent environment library (https://arxiv.org/abs/2009.14471) that aims to be the gym standard for Multi-Agent Reinforcement Learning. We list below the advantages that make it suitable for use with flatland:

* Works with both rllib (https://docs.ray.io/en/latest/rllib.html) and stable baselines 3 (https://stable-baselines3.readthedocs.io/) using wrappers from SuperSuit.
* Clean API (https://www.pettingzoo.ml/api) with additional facilities/APIs for parallel execution, saving observations, recording with the gym monitor, and processing/normalising observations.
* Scikit-learn-inspired API, e.g.

  .. code-block:: python

     act = model.predict(obs, deterministic=True)[0]

* Parallel learning with stable baselines 3 using just two lines of code:

  .. code-block:: python

     env = ss.pettingzoo_env_to_vec_env_v0(env)
     env = ss.concat_vec_envs_v0(env, 8, num_cpus=4, base_class="stable_baselines3")

* Tested on various multi-agent environments with agent counts comparable to flatland, e.g. https://www.pettingzoo.ml/magent
* The clean interface means we can add an experiment-tracking tool like wandb and have full flexibility in what information we save.
# Environment Wrappers

> We provide various environment wrappers to work with both the rail env and the PettingZoo interface.

## Background

These wrappers change certain environment behaviors, which can help improve reinforcement learning training.

## Supported Inbuilt Wrappers

We provide two sample wrappers: the ShortestPathAction wrapper and the SkipNoChoice wrapper. The wrappers require many env properties that are only created on environment reset, so the rail env must be reset before a wrapper is applied. To use a wrapper, simply pass in the freshly reset rail env. Code samples are shown below for each wrapper.

### ShortestPathAction Wrapper

To use the ShortestPathAction wrapper, simply wrap the rail env as follows:

```python
rail_env.reset(random_seed=1)
rail_env = ShortestPathActionWrapper(rail_env)
```

The shortest path action wrapper maps the existing action space into 3 actions - Shortest Path (`0`), Next Shortest Path (`1`) and Stop (`2`). Hence, the predicted action must always be one of these three (0, 1 or 2). To route all agents along their shortest paths, pass `0` as the action, as shown in the sketch below.
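
For example, routing every agent along its shortest path is a loop of all-zero actions; this sketch assumes the wrapped env keeps the standard `get_agent_handles()`/`step()` interface:

```python
# Sketch: drive all agents along their shortest paths.
# Assumes `rail_env` was reset and wrapped as shown above.
dones = {"__all__": False}
while not dones["__all__"]:
    actions = {handle: 0 for handle in rail_env.get_agent_handles()}  # 0 = shortest path
    obs, rewards, dones, info = rail_env.step(actions)
```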
### SkipNoChoice Wrapper

To use the SkipNoChoice wrapper, simply wrap the rail env as follows:

```python
rail_env.reset(random_seed=1)
rail_env = SkipNoChoiceCellsWrapper(rail_env, accumulate_skipped_rewards=False, discounting=0.0)
```
Environment Wrappers
====================

..

   We provide various environment wrappers to work with both the rail env and the PettingZoo interface.

Background
----------

These wrappers change certain environment behaviors, which can help improve reinforcement learning training.

Supported Inbuilt Wrappers
--------------------------

We provide two sample wrappers: the ShortestPathAction wrapper and the SkipNoChoice wrapper. The wrappers require many env properties that are only created on environment reset, so the rail env must be reset before a wrapper is applied. To use a wrapper, simply pass in the freshly reset rail env. Code samples are shown below for each wrapper.

ShortestPathAction Wrapper
^^^^^^^^^^^^^^^^^^^^^^^^^^

To use the ShortestPathAction wrapper, simply wrap the rail env as follows:

.. code-block:: python

   rail_env.reset(random_seed=1)
   rail_env = ShortestPathActionWrapper(rail_env)

The shortest path action wrapper maps the existing action space into 3 actions - Shortest Path (\ ``0``\ ), Next Shortest Path (\ ``1``\ ) and Stop (\ ``2``\ ). Hence, the predicted action must always be one of these three (0, 1 or 2). To route all agents along their shortest paths, pass ``0`` as the action.

SkipNoChoice Wrapper
^^^^^^^^^^^^^^^^^^^^

To use the SkipNoChoice wrapper, simply wrap the rail env as follows:

.. code-block:: python

   rail_env.reset(random_seed=1)
   rail_env = SkipNoChoiceCellsWrapper(rail_env, accumulate_skipped_rewards=False, discounting=0.0)
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=python -msphinx
)
set SOURCEDIR=.
set BUILDDIR=_build
set SPHINXPROJ=flatland
if "%1" == "" goto help
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The Sphinx module was not found. Make sure you have Sphinx installed,
echo.then set the SPHINXBUILD environment variable to point to the full
echo.path of the 'sphinx-build' executable. Alternatively you may add the
echo.Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
exit /b 1
)
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
:end
popd
flatland
========

.. toctree::
   :maxdepth: 4

   flatland
## Core Specifications

### Environment Class Overview

The Environment class contains all necessary functions for the interactions between the agents and the environment. The base Environment class is derived from rllib.env.MultiAgentEnv (https://github.com/ray-project/ray).

The functions are specific to each realization of Flatland (e.g. Railway, Vaccination, ...).

In particular, we retain the rllib interface in the use of the step() function, which accepts a dictionary of actions indexed by the agents' handles (returned by get_agent_handles()) and returns dictionaries of observations, rewards, dones and infos.

```python
class Environment:
    """Base interface for multi-agent environments in Flatland.

    Agents are identified by agent ids (handles).

    Examples:

        >>> obs, info = env.reset()
        >>> print(obs)
        {
            "train_0": [2.4, 1.6],
            "train_1": [3.4, -3.2],
        }
        >>> obs, rewards, dones, infos = env.step(
        ...     action_dict={
        ...         "train_0": 1, "train_1": 0})
        >>> print(rewards)
        {
            "train_0": 3,
            "train_1": -1,
        }
        >>> print(dones)
        {
            "train_0": False,  # train_0 is still running
            "train_1": True,   # train_1 is done
            "__all__": False,  # the env is not done
        }
        >>> print(infos)
        {
            "train_0": {},  # info for train_0
            "train_1": {},  # info for train_1
        }
    """

    def __init__(self):
        pass

    def reset(self):
        """
        Resets the env and returns observations from agents in the environment.

        Returns
        -------
        obs : dict
            New observations for each agent.
        """
        raise NotImplementedError()

    def step(self, action_dict):
        """
        Performs an environment step with simultaneous execution of actions for
        agents in action_dict.
        Returns observations from agents in the environment.
        The returns are dicts mapping from agent_id strings to values.

        Parameters
        ----------
        action_dict : dict
            Dictionary of actions to execute, indexed by agent id.

        Returns
        -------
        obs : dict
            New observations for each ready agent.
        rewards : dict
            Reward values for each ready agent.
        dones : dict
            Done values for each ready agent. The special key "__all__"
            (required) is used to indicate env termination.
        infos : dict
            Optional info values for each agent id.
        """
        raise NotImplementedError()

    def render(self):
        """
        Perform rendering of the environment.
        """
        raise NotImplementedError()

    def get_agent_handles(self):
        """
        Returns a list of agents' handles to be used as keys in the step()
        function.
        """
        raise NotImplementedError()
```
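
To show how this interface is meant to be realized, here is a toy subclass; the dynamics are invented purely for illustration and are not part of Flatland:

```python
# Toy realization of the interface above; the dynamics are invented
# for illustration and are not part of Flatland.
class CountdownEnv(Environment):
    """Each agent counts down from 3; action 1 decrements, action 0 waits."""

    def __init__(self, num_agents=2):
        super().__init__()
        self.handles = ["train_{}".format(i) for i in range(num_agents)]
        self.state = {}

    def get_agent_handles(self):
        return self.handles

    def reset(self):
        self.state = {h: 3 for h in self.handles}
        return dict(self.state)

    def step(self, action_dict):
        obs, rewards, dones, infos = {}, {}, {}, {}
        for handle in self.handles:
            action = action_dict.get(handle, 0)  # default: do nothing
            if action == 1 and self.state[handle] > 0:
                self.state[handle] -= 1
            obs[handle] = self.state[handle]
            rewards[handle] = 1 if self.state[handle] == 0 else 0
            dones[handle] = self.state[handle] == 0
            infos[handle] = {}
        dones["__all__"] = all(dones[h] for h in self.handles)
        return obs, rewards, dones, infos
```

Stepping it with `{"train_0": 1, "train_1": 0}` returns the four per-agent dicts in exactly the shape specified above, including the required `"__all__"` key.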
## Intro

In human-readable language, the specifications provide

- a code base overview (hand-drawn concept)
- the key concepts (generators, envs) and how they are linked
- links to the relevant code base
![Overview](img/UML_flatland.png)
`Diagram Source <https://confluence.sbb.ch/x/pQfsSw>`_
Observation and Action Spaces
-----------------------------
This is an introduction to the three standard observations and the action space of **Flatland**.
Action Space
^^^^^^^^^^^^
Flatland is a railway simulation, so the actions of an agent are strongly constrained by the railway network. This means that in many cases not all actions are valid.
The possible actions of an agent are:

- ``0`` **Do Nothing**: If the agent is moving, it continues moving; if it is stopped, it stays stopped.
- ``1`` **Deviate Left**: If the agent is at a switch with a transition to its left, the agent will choose the left path. Otherwise the action has no effect. If the agent is stopped, this action will start agent movement again if allowed by the transitions.
- ``2`` **Go Forward**: This action will start the agent when stopped. It moves the agent forward and chooses the straight direction at switches.
- ``3`` **Deviate Right**: Exactly the same as deviate left, but for right turns.
- ``4`` **Stop**: This action causes the agent to stop.
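
To make the action interface concrete, here is a minimal sketch; it assumes ``env`` is a freshly reset ``flatland.envs.rail_env.RailEnv`` exposing the documented ``get_agent_handles()`` and ``step()`` methods:

.. code-block:: python

   # Sketch: tell every agent to go forward (action 2) for a few steps.
   # Assumes `env` is a reset flatland.envs.rail_env.RailEnv instance.
   for _ in range(10):
       actions = {handle: 2 for handle in env.get_agent_handles()}
       obs, rewards, dones, info = env.step(actions)
       if dones["__all__"]:
           break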
Observation Spaces
^^^^^^^^^^^^^^^^^^
In the **Flatland** environment we have included three basic observations to get started. The figure below illustrates the observation range of the different basic observations: ``Global``, ``Local Grid`` and ``Local Tree``.
.. image:: https://i.imgur.com/oo8EIYv.png
   :height: 100
   :width: 200
Global Observation
~~~~~~~~~~~~~~~~~~
Gives a global observation of the entire rail environment.
The observation is composed of the following elements:
- A transition map array with dimensions (``env.height``, ``env.width``, ``16``), assuming a **16-bit encoding of transitions**.
- Two 2D arrays stacked as (``map_height``, ``map_width``, ``2``), containing respectively the position of the given agent's target and the positions of the other agents' targets.
- A 3D array (``map_height``, ``map_width``, ``8``) with the **first 4 channels** containing the **one hot encoding** of the direction of the given agent and the second 4 channels containing the positions of the other agents at their position coordinates.
We encourage you to enhance this observation with any layer you think might help solve the problem.
It would also be possible to construct a global observation for a super agent that controls all agents at once.
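
To illustrate the layout of the elements above, here is a sketch of the constituent array shapes using plain numpy; the concrete sizes are assumptions, and the real observation is produced by an observation builder such as ``GlobalObsForRailEnv``:

.. code-block:: python

   import numpy as np

   # Illustrative shapes only; values here are all zeros.
   height, width = 30, 30                           # assumed env.height, env.width
   transition_map = np.zeros((height, width, 16))   # 16-bit transition encoding
   target_maps    = np.zeros((height, width, 2))    # own target / other targets
   agent_maps     = np.zeros((height, width, 8))    # own direction / other agents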
Local Grid Observation
~~~~~~~~~~~~~~~~~~~~~~
Gives a local observation of the rail environment around the agent.
The observation is composed of the following elements:
- A transition map array of the local environment around the given agent, with dimensions (``2*view_radius + 1``, ``2*view_radius + 1``, ``16``), assuming a **16-bit encoding of transitions**.
- Two 2D arrays stacked as (``2*view_radius + 1``, ``2*view_radius + 1``, ``2``), containing respectively, if they are within the agent's vision range, its own target position and the positions of the other agents' targets.
- A 3D array (``2*view_radius + 1``, ``2*view_radius + 1``, ``4``) containing the one hot encoding of the directions of the other agents at their position coordinates, if they are within the agent's vision range.
- A 4-element array with the one hot encoding of the agent's direction.
Be aware that this observation **does not** contain any clues about the target location if the target is out of range. Thus navigation on maps where the radius of the observation does not guarantee a visible target at all times will become very difficult.
We encourage you to come up with creative ways to overcome this problem. In the tree observation below we introduce the concept of distance maps.
Tree Observation
~~~~~~~~~~~~~~~~
The tree observation is built by exploiting the graph structure of the railway network. The observation is generated by spanning a **4-branched tree** from the current position of the agent. Each branch follows the allowed transitions (the backward branch is only allowed at dead-ends) until a cell with multiple allowed transitions is reached. There, the information gathered along the branch is stored as a node in the tree.
The figure below illustrates how the tree observation is built:
1. From the agent's location, probe all 4 directions (``L:Blue``, ``F:Green``, ``R:Purple``, ``B:Red``), starting with left, and start a branch whenever a transition is allowed.

   1. For each branch, walk along the allowed transitions until you reach a dead-end, a switch or the target destination.
   2. Create a node and fill in the node information as stated below.
   3. If the max depth of the tree is not reached and there are possible transitions, start new branches and repeat the steps above.

2. Fill up all non-existing branches with -infinity, such that the tree size is invariant to the number of possible transitions at branching points.
Note that we always start with the left branch, according to the agent's orientation. Thus the tree observation is independent of the NESW orientation of cells, and only considers the transitions relative to the agent's orientation.
The colors in the figure below illustrate which branch a cell belongs to. If there are multiple colors in a cell, this cell is visited by different branches of the tree observation.
The right side of the figure shows the resulting tree of the railway network on the left. A cross means no branch was built. If a node has no children, it is a terminal node (dead-end, max depth reached or no transition possible). A circle indicates a node filled with the corresponding information stated below in Node Information.
.. image:: https://i.imgur.com/sGBBhzJ.png
   :height: 100
   :width: 200
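
The fixed-size padding in step 2 above can be sketched as follows; ``NUM_FEATURES`` and the per-direction dictionary are our own illustrative names, not flatland's actual tree-observation API:

.. code-block:: python

   import math

   NUM_FEATURES = 9  # the 9 node features listed under Node Information below

   def padded_children(children):
       """Illustrative only: pad missing branches with -inf feature vectors,
       so the flattened tree has the same size at every branching point."""
       padding = [-math.inf] * NUM_FEATURES
       # One slot per direction relative to the agent: Left, Forward, Right, Backward.
       return [children.get(direction, padding) for direction in ("L", "F", "R", "B")]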
Node Information
~~~~~~~~~~~~~~~~
Each node is filled with information gathered along the path to the node. Currently each node contains 9 features:
- 1: if own target lies on the explored branch, the current distance from the agent in number of cells is stored.
- 2: if another agent's target is detected, the distance in number of cells from the current agent position is stored.
- 3: if another agent is detected, the distance in number of cells from the current agent position is stored.
- 4: possible conflict detected (this only works when we use a predictor and will not be important in this tutorial).
- 5: if an unusable switch (for the agent) is detected, we store the distance. An unusable switch is a switch where the agent does not have any choice of path, but other agents coming from different directions might.
- 6: this feature stores the distance (in number of cells) to the next node (e.g. switch, target or dead-end).
- 7: minimum remaining travel distance from this node to the agent's target, given the direction of the agent, if this path is chosen.
- 8: agent in the same direction found on path to node

  - ``n`` = number of agents present in the same direction (possible future use: number of other agents in the same direction in this branch)
  - ``0`` = no agent present in the same direction

- 9: agent in the opposite direction on path to node

  - ``n`` = number of agents present in the opposite direction to the observing agent
  - ``0`` = no agent present in the opposite direction to the observing agent
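
For illustration, the 9 features could be grouped in a container like the one below; the field names are placeholders of our own, not flatland's actual node type:

.. code-block:: python

   from dataclasses import dataclass

   @dataclass
   class TreeObsNode:
       """Hypothetical container mirroring the 9 features above."""
       dist_own_target: float              # 1
       dist_other_target: float            # 2
       dist_other_agent: float             # 3
       dist_potential_conflict: float      # 4
       dist_unusable_switch: float         # 5
       dist_to_next_branch: float          # 6
       dist_min_to_target: float           # 7
       num_agents_same_direction: int      # 8
       num_agents_opposite_direction: int  # 9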
## Rendering Specifications
### Scope
This doc specifies the software to meet the requirements in the Visualization requirements doc.
### References
- [Visualization Requirements](visualization)
- [Core Spec](./core)
### Interfaces

#### Interface with Environment Component

- Environment produces the Env Snapshot data structure (TBD)
- Renderer reads the Env Snapshot
- Connection between Env and Renderer, either:
  - Environment “invokes” the renderer in-process
  - Renderer “connects” to the environment
    - E.g. Env acts as a server, Renderer as a client
- Either:
  - The Env sends a Snapshot to the renderer and waits for rendering
- Or:
  - The Env puts snapshots into a rendering queue
  - The renderer blocks / waits on the queue, waiting for a new snapshot to arrive
    - If several snapshots are waiting, delete and skip them and just render the most recent (see the sketch after this list)
  - Delete the snapshot after rendering
- Optionally:
  - Render every frame / time step
  - Or, render frames without blocking the environment
    - Render frames in a separate process / thread
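
A minimal sketch of the "rendering queue" option, using only the standard library; `render` is a hypothetical callable, and draining the queue implements "just render the most recent":

```python
import queue
import threading

snapshot_queue = queue.Queue()

def render_loop(render):
    """Consume snapshots; if several are waiting, keep only the newest."""
    while True:
        snapshot = snapshot_queue.get()          # block until one arrives
        try:
            while True:                          # drain any backlog
                snapshot = snapshot_queue.get_nowait()
        except queue.Empty:
            pass
        render(snapshot)                         # render the most recent only

# Env side: enqueue without blocking, render in a separate thread.
# snapshot_queue.put(env_snapshot)
# threading.Thread(target=render_loop, args=(my_render_fn,), daemon=True).start()
```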
#### Environment Snapshot

##### Data Structure

A definition of the data structure is to be provided in the Core requirements or Interfaces doc.

##### Example only

Top-level dictionary:

- World nd-array
  - Each element represents the available transitions in a cell
- List of agents
  - Agent location, orientation, movement (forward / stop / turn?)
- Observation
  - Rectangular observation
    - Maybe just dimensions - width + height (i.e. no need for contents)
    - Can be highlighted in display as per minigrid
  - Tree-based observation
    - TBD
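
To make the example concrete, here is a hypothetical snapshot literal following the structure above; all field names are placeholders, not a committed schema:

```python
import numpy as np

# Hypothetical snapshot instance; purely illustrative, not a committed schema.
env_snapshot = {
    "world": np.zeros((30, 30), dtype=np.uint16),  # 16-bit transitions per cell
    "agents": [
        {"location": (3, 7), "orientation": "N", "movement": "forward"},
        {"location": (12, 5), "orientation": "E", "movement": "stop"},
    ],
    "observation": {"rect": {"width": 5, "height": 5}, "tree": None},  # tree TBD
}
```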
#### Existing Tools / Libraries

1. Pygame
   1. Very easy to use. Dead simple to add sprites etc. [Link](https://studywolf.wordpress.com/2015/03/06/arm-visualization-with-pygame/)
   2. No inbuilt support for threads/processes. Does get faster when using pypy/psyco.
2. PyQt
   1. Somewhat simple, a little more verbose to use the different modules.
   2. Multi-threaded via QThread! Yay! (Doesn't block the main thread that does the real work.) [Link](https://nikolak.com/pyqt-threading-tutorial/)
##### How to structure the code

1. Define draw functions/classes for each primitive
   1. Primitives: Agents (Trains), Railroad, Grass, Houses etc.
2. Background. Initialize the background before starting the episode.
   1. Static objects in the scene: directly draw those primitives once and cache the result (see the sketch below).
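
A minimal pygame sketch of the background-caching pattern described above; the grid size and colors are invented for the example:

```python
import pygame

pygame.init()
screen = pygame.display.set_mode((300, 300))

# Draw the static background once and cache it on its own surface.
background = pygame.Surface(screen.get_size())
background.fill((40, 120, 40))                                        # grass
for x in range(0, 300, 30):
    pygame.draw.line(background, (90, 90, 90), (x, 0), (x, 300))      # rails

agent_pos = [15, 15]
clock = pygame.time.Clock()
for _ in range(60):                                                   # short demo loop
    screen.blit(background, (0, 0))                                   # cheap: cached background
    pygame.draw.circle(screen, (200, 30, 30), agent_pos, 10)          # dynamic agent on top
    pygame.display.flip()
    agent_pos[0] = (agent_pos[0] + 3) % 300
    clock.tick(30)
pygame.quit()
```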
##### Proposed Interfaces
To-be-filled
#### Technical Graphics Considerations
##### Overlay dynamic primitives over the background at each time step.
There is no point trying to figure out per-frame changes: every dynamic primitive needs to be drawn explicitly anyway (that's how these renderers work).