Commit f111b798 authored by hagrid67

tweak comments in ACF notebook

parent 646eed3a
%% Cell type:markdown id: tags:
# Unordered Close Following Agents
Many of you will be aware that Flatland agents cannot follow each other close behind, unless they are in agent index order, ie Agent 1 can follow Agent 0, but Agent 0 cannot follow Agent 1 unless it leaves a gap of one cell.
We have now provided an update which removes this restriction. It's currently in the branch `323_unordered_cell_locking`. It means that agents (moving at the same speed) can now always follow each other without leaving a gap.
**Why is this a big deal? Or even a deal?**
Many of the OR solutions took advantage of it to send agents in the "correct" index order so that they could make better use of the available space, but we believe it's harder for RL solutions to do the same.
Think of a chain of agents, in random order, moving in the same direction. For any adjacent pair of agents, there's a 0.5 chance that it is in index order, ie index(A) < index(B) where A is in front of B. So roughly half the adjacent pairs will need to leave a gap and half won't, and the chain of agents will typically be one-third empty space. By removing the restriction, we can keep the agents close together and so move up to 50% more agents through a junction or segment of rail in the same number of steps.
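The one-third figure can be sanity-checked with a quick simulation (a sketch for illustration only, not part of the Flatland code): a follower needs no gap when its index is higher than the agent in front, otherwise one empty cell is inserted.

```python
import random

def chain_length(order):
    # order[0] is the front agent; a gap is needed whenever the
    # agent in front has a higher index than the agent behind it
    gaps = sum(1 for front, behind in zip(order, order[1:]) if front > behind)
    return len(order) + gaps

random.seed(0)
n = 100
trials = [chain_length(random.sample(range(n), n)) for _ in range(1000)]
mean_len = sum(trials) / len(trials)
empty_fraction = 1 - n / mean_len   # close to 1/3 for random orderings
```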
**What difference does it make in practice?**
We have run a few tests and it does seem to slightly increase the training performance of existing RL models.
**Does the order not matter at all now?**
Well, yes, a bit. We are still using index order to resolve conflicts between two agents trying to move into the same spot, for example, head-on collisions, or agents "merging" at junctions.
**This sounds boring. Is there anything interesting about it at all?**
Thanks for reading this far... It was quite interesting to implement. Think of a chain of moving agents in reverse index order. The `env.step()` iterates them from the back of the chain (lowest index) to the front, so when it gets to the front agent, it's already processed all the others. Now suppose the front agent has decided to stop, or is blocked. The env needs to propagate that back through the chain of agents, and none of them can in fact move. You can see how this might get a bit more complicated with "trees" of merging agents etc. And how do we identify a chain at all?
We did it by storing an agent's position as a graph node, and a movement as a directed edge, using the NetworkX graph library. We create an empty graph for each step, and add the agents into the graph in order, using their (row, column) location for the node. Stationary agents get a self-loop. Agents in an adjacent chain naturally get "connected up". We then use some NetworkX algorithms:
- `weakly_connected_components` to find the chains.
- `selfloop_edges` to find the stopped agents.
- `dfs_postorder_nodes` to traverse a chain.
- `simple_cycles` to find agents colliding head-on.
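As an illustration of the idea (a toy sketch, not the actual `MotionCheck` implementation), here is how those four algorithms behave on a minimal three-agent chain whose front agent has stopped:

```python
import networkx as nx

# Three agents in a row: back at (0,0), middle at (0,1), front at (0,2).
# An edge points from an agent's cell to the cell it wants to move into.
G = nx.DiGraph()
G.add_edge((0, 0), (0, 1))   # back agent wants to move into (0,1)
G.add_edge((0, 1), (0, 2))   # middle agent wants to move into (0,2)
G.add_edge((0, 2), (0, 2))   # front agent is stationary: a self-loop

# One weakly-connected component corresponds to one chain of agents
chains = list(nx.weakly_connected_components(G))

# Self-loops mark the stopped agents
stopped = {u for u, v in nx.selfloop_edges(G)}

# Postorder DFS from the back of the chain visits the front agent first,
# so a stop can be propagated backwards along the chain
order = list(nx.dfs_postorder_nodes(G, source=(0, 0)))

# A head-on swap would show up as a 2-cycle; there is none here
swaps = [c for c in nx.simple_cycles(G) if len(c) > 1]
```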
We can also display a NetworkX graph very simply, but neatly, using GraphViz (see below).
**Does it run faster / slower?**
It seems to make almost no difference to the speed.
**How do you handle agents entering the env / spawning?**
For an agent in state READY_TO_DEPART we use a dummy cell of `(-1, agent_id)`. This means that if several agents try to enter the env in the same cell and in the same step, the agent with the lowest index will get to start first. It uses the same rule as above: the agent with the lowest index gets to enter an empty cell ahead of any others.
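A minimal sketch of the dummy-cell idea (hypothetical, not the real `MotionCheck` API): each waiting agent gets its own off-grid node, so the contest for the spawn cell is resolved by index like any other cell conflict.

```python
import networkx as nx

# Hypothetical example: agents 3 and 7 are READY_TO_DEPART and both
# want to enter the same spawn cell (5, 5) in the same step.
G = nx.DiGraph()
for agent_id in (3, 7):
    dummy = (-1, agent_id)       # one off-grid dummy cell per waiting agent
    G.add_edge(dummy, (5, 5))    # both edges target the spawn cell

# The spawn cell is contested: two predecessors want to move in.
# Sorting the dummy cells sorts by agent index, so the lowest index wins.
contenders = sorted(G.predecessors((5, 5)))
winner = contenders[0]
```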
%% Cell type:code id: tags:
``` python
%load_ext autoreload
%autoreload 2
from IPython.core import display
display.display(display.HTML("<style>.container { width:95% !important; }</style>"))
```
%% Cell type:code id: tags:
``` python
import networkx as nx
import PIL
from IPython import display
import time
from matplotlib import pyplot as plt
import numpy as np
```
%% Cell type:code id: tags:
``` python
from flatland.envs import malfunction_generators as malgen
from flatland.envs.agent_utils import EnvAgent
from flatland.envs import rail_generators as rail_gen
from flatland.envs import agent_chains as ac
from flatland.envs.rail_env import RailEnv, RailEnvActions
from flatland.envs.persistence import RailEnvPersister
from flatland.utils.rendertools import RenderTool
from flatland.utils import env_edit_utils as eeu
from flatland.utils import jupyter_utils as ju
```
%% Cell type:markdown id: tags:
## Load the test cases
For now the test cases are in the same file as the code. First we display them without detecting collisions / conflicts, just the motions.
%% Cell type:code id: tags:
``` python
omc = ac.MotionCheck()
ac.create_test_agents2(omc)
rv = ac.render(omc)
print(type(rv))
```
%% Cell type:markdown id: tags:
## Detect conflicts and re-render
We colour the nodes to indicate conflicts:
- Red means stopped.
- Purple means a swap-over conflict (ie a head-on collision where the agents are adjacent).
- Blue means an empty cell which two or more agents are trying to move into.
- Magenta means an agent vacating a cell which two or more other agents are trying to move into.
- Black means no conflict, so an agent will move to the new cell.
%% Cell type:code id: tags:
``` python
omc.find_conflicts()
```
%% Cell type:code id: tags:
``` python
gvDot = ac.render(omc)
gvDot
```
%% Cell type:code id: tags:
``` python
#for v, dPred in omc.G.pred.items():
#    print(v, dPred)
```
%% Cell type:code id: tags:
``` python
#import graphviz
```
%% Cell type:code id: tags:
``` python
#oAG = nx.drawing.nx_agraph.to_agraph(omc.G)
#oAG.layout("dot")
#sDot = oAG.to_string()
#oSrc = graphviz.Source(sDot)
```
%% Cell type:code id: tags:
``` python
#sSVG = oSrc._repr_svg_()
```
%% Cell type:code id: tags:
``` python
#display.SVG(sSVG)
```
%% Cell type:markdown id: tags:
## Load a test Env
Load an env and invoke the chain checker.
%% Cell type:code id: tags:
``` python
env, envModel = eeu.makeTestEnv("merging_spurs", nAg=10, bUCF=True)
oEC = ju.EnvCanvas(env)
oEC.show()
```
%% Cell type:code id: tags:
``` python
for i in range(10):
    oEC.step()
    oEC.render()
    display.display_html(f"<br>Step: {i}\n", raw=True)
    display.display_svg(ac.render(env.motionCheck, horizontal=(i>=3)))
    time.sleep(0.1)
```
%% Cell type:code id: tags:
``` python
dAgStateFrozen = {0: (1, 11, 1),
                  1: (1, 8, 1),
                  2: (1, 10, 1),
                  3: (1, 7, 1),
                  4: (1, 9, 1),
                  5: (1, 6, 0),
                  6: (1, 5, 1),
                  7: (2, 6, 0),
                  8: (1, 4, 1),
                  9: (3, 6, 0)}
```
%% Cell type:code id: tags:
``` python
dAgState = {}
for iAg, ag in enumerate(env.agents):
    dAgState[iAg] = (*ag.position, ag.direction)
assert dAgState == dAgStateFrozen
```
%% Cell type:code id: tags:
``` python
env.motionCheck.svDeadlocked
```
%% Cell type:markdown id: tags:
# Deadlocking agents
We have added deadlock detection in the close-following code. This detects agents which are head-to-head, ie facing in opposing directions in adjacent squares, and thus will not be able to move for the rest of the episode. This deadlocked status is propagated to any other agents which are blocked by the opposing pair.
In the example below, agents 0 and 1 collide head-on. The other agents are spaced out behind them and collide into them in subsequent steps.
The deadlock status is now recorded in element 5 of each agent in the recorded episode: (row, col, dir, status, malfunction, deadlock).
__Bugs / Limitations__
The code does not currently count agents as deadlocked if they are also malfunctioning, or if they choose to stop moving voluntarily.
The code does not detect agents which are about to become deadlocked because they are heading towards each other on a track with no junctions or relevant targets between them.
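The head-on case can be sketched with a toy graph (illustrative only, not the notebook's code): a swap shows up as a 2-cycle in `simple_cycles`, and the deadlocked status is then propagated backwards to any follower whose target cell is deadlocked.

```python
import networkx as nx

# Agent A at (0,4) moving right, agent B at (0,5) moving left: a head-on
# collision is a 2-cycle. Agent C at (0,3) is blocked behind A.
G = nx.DiGraph()
G.add_edge((0, 4), (0, 5))   # A wants B's cell
G.add_edge((0, 5), (0, 4))   # B wants A's cell
G.add_edge((0, 3), (0, 4))   # C wants A's cell

# A 2-cycle means two adjacent agents trying to swap: a hard deadlock
deadlocked = set()
for cycle in nx.simple_cycles(G):
    if len(cycle) == 2:
        deadlocked.update(cycle)

# Propagate the status back to agents blocked behind a deadlocked cell
changed = True
while changed:
    changed = False
    for u, v in G.edges():
        if v in deadlocked and u not in deadlocked and u != v:
            deadlocked.add(u)
            changed = True
```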
%% Cell type:code id: tags:
``` python
env, envModel = eeu.makeTestEnv("loop_with_loops", nAg=10, bUCF=True)
oEC = ju.EnvCanvas(env, behaviour=ju.DelayedStartForward(env, nStartDelay=1))
oEC.show()
```
%% Cell type:code id: tags:
``` python
for i in range(25):
    oEC.step()
    oEC.render()
    #display.display_html(f"<br>Step: {i}\n", raw=True)
    #display.display_svg(ac.render(env.motionCheck, horizontal=(i>=3)))
    time.sleep(0.1)
```
%% Cell type:code id: tags:
``` python
env.motionCheck.svDeadlocked
```
%% Cell type:code id: tags:
``` python
g3Ep = np.array(env.cur_episode)
g3Ep.shape
```
%% Cell type:code id: tags:
``` python
nSteps = g3Ep.shape[0]
plt.step(range(nSteps), np.sum(g3Ep[:,:,5], axis=1))
plt.title("Deadlocked agents")
plt.xticks(range(g3Ep.shape[0]))
plt.yticks(range(11))
plt.grid()
```
%% Cell type:code id: tags:
``` python
gnDeadlockExpected = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 5, 7, 9, 10, 10, 10, 10])
gnDeadlock = np.sum(g3Ep[:,:,5], axis=1)
assert np.all(gnDeadlock == gnDeadlockExpected), "Deadlocks by step do not match expected values!"
```
%% Cell type:code id: tags:
``` python
cmap = plt.get_cmap("summer")
nT, nAg, _ = g3Ep.shape
plt.pcolormesh(g3Ep[:,:,5].T, cmap=cmap)
plt.xlabel("Time Step")
plt.xticks(range(0, nT, 2))
plt.ylabel("Agent index")
plt.yticks(range(nAg))
plt.colorbar(ticks=[0,1], fraction=0.018, pad=0.1)
plt.title("Deadlock status")
plt.grid()
```