Skip to content
Snippets Groups Projects
Commit a447381f authored by Erik Nygren's avatar Erik Nygren :bullettrain_front:
Browse files

Merge branch 'update_rst_docs' into 'master'

Update rst docs

See merge request flatland/flatland!140
parents 4df3a938 88432963
No related branches found
No related tags found
No related merge requests found
...@@ -84,7 +84,8 @@ Note that this simple strategy fails when multiple agents are present, as each a ...@@ -84,7 +84,8 @@ Note that this simple strategy fails when multiple agents are present, as each a
""" """
def __init__(self): def __init__(self):
super().__init__(max_depth=0) super().__init__(max_depth=0)
# We set max_depth=0 in because we only need to look at the current position of the agent to decide what direction is shortest. # We set max_depth=0 in because we only need to look at the current
# position of the agent to decide what direction is shortest.
self.observation_space = [3] self.observation_space = [3]
def reset(self): def reset(self):
...@@ -120,7 +121,8 @@ Note that this simple strategy fails when multiple agents are present, as each a ...@@ -120,7 +121,8 @@ Note that this simple strategy fails when multiple agents are present, as each a
env = RailEnv(width=7, env = RailEnv(width=7,
height=7, height=7,
rail_generator=complex_rail_generator(nr_start_goal=10, nr_extra=1, min_dist=8, max_dist=99999, seed=0), rail_generator=complex_rail_generator(nr_start_goal=10, nr_extra=1, \
min_dist=8, max_dist=99999, seed=0),
number_of_agents=2, number_of_agents=2,
obs_builder_object=SingleAgentNavigationObs()) obs_builder_object=SingleAgentNavigationObs())
...@@ -154,3 +156,138 @@ navigation to target, and shows the path taken as an animation. ...@@ -154,3 +156,138 @@ navigation to target, and shows the path taken as an animation.
The code examples above appear in the example file `custom_observation_example.py <https://gitlab.aicrowd.com/flatland/flatland/blob/master/examples/custom_observation_example.py>`_. You can run it using :code:`python examples/custom_observation_example.py` from the root folder of the flatland repo. The two examples are run one after the other. The code examples above appear in the example file `custom_observation_example.py <https://gitlab.aicrowd.com/flatland/flatland/blob/master/examples/custom_observation_example.py>`_. You can run it using :code:`python examples/custom_observation_example.py` from the root folder of the flatland repo. The two examples are run one after the other.
Example 3 : Using custom predictors and rendering observation
--------------
Because the re-scheduling task of the Flatland-Challenge_ requires some short time planning we allow the possibility to use custom predictors that help predict upcoming conflicts and help agent solve them in a timely manner.
In the **Flatland Environment** we included an initial predictor ShortestPathPredictorForRailEnv_ to give you an idea what you can do with these predictors.
Any custom predictor can be passed to the observation builder and then be used to build the observation. In this example_ we illustrate how an observation builder can be used to detect conflicts using a predictor.
The observation is incomplete as it only contains information about potential conflicts and has no feature about the agent objectives.
In addition to using your custom predictor you can also make your custom observation ready for rendering. (This can be done in a similar way for your predictor).
All you need to do in order to render your custom observation is to populate :code:`self.env.dev_obs_dict[handle]` for every agent (all handles). (For the predictor use :code:`self.env.dev_pred_dict[handle]`).
In contrast to the previous examples we also implement the :code:`def get_many(self, handles=None)` function for this custom observation builder. The reasoning here is that we want to call the predictor only once per :code:`env.step()`. The base implementation of :code:`def get_many(self, handles=None)` will call the :code:`get(handle)` function for all handles, which mean that it normally does not need to be reimplemented, except for cases as the one below.
.. _ShortestPathPredictorForRailEnv: https://gitlab.aicrowd.com/flatland/flatland/blob/master/flatland/envs/predictions.py#L81
.. _example: https://gitlab.aicrowd.com/flatland/flatland/blob/master/examples/custom_observation_example.py#L110
.. code-block:: python
class ObservePredictions(TreeObsForRailEnv):
"""
We use the provided ShortestPathPredictor to illustrate the usage of predictors in your custom observation.
We derive our observation builder from TreeObsForRailEnv, to exploit the existing implementation to compute
the minimum distances from each grid node to each agent's target.
This is necessary so that we can pass the distance map to the ShortestPathPredictor
Here we also want to highlight how you can visualize your observation
"""
def __init__(self, predictor):
super().__init__(max_depth=0)
self.observation_space = [10]
self.predictor = predictor
def reset(self):
# Recompute the distance map, if the environment has changed.
super().reset()
def get_many(self, handles=None):
'''
Because we do not want to call the predictor seperately for every agent we implement the get_many function
Here we can call the predictor just ones for all the agents and use the predictions to generate our observations
:param handles:
:return:
'''
self.predictions = self.predictor.get(custom_args={'distance_map': self.distance_map})
self.predicted_pos = {}
for t in range(len(self.predictions[0])):
pos_list = []
for a in handles:
pos_list.append(self.predictions[a][t][1:3])
# We transform (x,y) coodrinates to a single integer number for simpler comparison
self.predicted_pos.update({t: coordinate_to_position(self.env.width, pos_list)})
observations = {}
# Collect all the different observation for all the agents
for h in handles:
observations[h] = self.get(h)
return observations
def get(self, handle):
'''
Lets write a simple observation which just indicates whether or not the own predicted path
overlaps with other predicted paths at any time. This is useless for the task of navigation but might
help when looking for conflicts. A more complex implementation can be found in the TreeObsForRailEnv class
Each agent recieves an observation of length 10, where each element represents a prediction step and its value
is:
- 0 if no overlap is happening
- 1 where n i the number of other paths crossing the predicted cell
:param handle: handeled as an index of an agent
:return: Observation of handle
'''
observation = np.zeros(10)
# We are going to track what cells where considered while building the obervation and make them accesible
# For rendering
visited = set()
for _idx in range(10):
# Check if any of the other prediction overlap with agents own predictions
x_coord = self.predictions[handle][_idx][1]
y_coord = self.predictions[handle][_idx][2]
# We add every observed cell to the observation rendering
visited.add((x_coord, y_coord))
if self.predicted_pos[_idx][handle] in np.delete(self.predicted_pos[_idx], handle, 0):
# We detect if another agent is predicting to pass through the same cell at the same predicted time
observation[handle] = 1
# This variable will be access by the renderer to visualize the observation
self.env.dev_obs_dict[handle] = visited
return observation
We can then use this new observation builder and the renderer to visualize the observation of each agent.
.. code-block:: python
# Initiate the Predictor
CustomPredictor = ShortestPathPredictorForRailEnv(10)
# Pass the Predictor to the observation builder
CustomObsBuilder = ObservePredictions(CustomPredictor)
# Initiate Environment
env = RailEnv(width=10,
height=10,
rail_generator=complex_rail_generator(nr_start_goal=5, nr_extra=1, min_dist=8, max_dist=99999, seed=0),
number_of_agents=3,
obs_builder_object=CustomObsBuilder)
obs = env.reset()
env_renderer = RenderTool(env, gl="PILSVG")
# We render the initial step and show the obsered cells as colored boxes
env_renderer.render_env(show=True, frames=True, show_observations=True, show_predictions=False)
action_dict = {}
for step in range(100):
for a in range(env.get_num_agents()):
action = np.random.randint(0, 5)
action_dict[a] = action
obs, all_rewards, done, _ = env.step(action_dict)
print("Rewards: ", all_rewards, " [done=", done, "]")
env_renderer.render_env(show=True, frames=True, show_observations=True, show_predictions=False)
time.sleep(0.5)
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment