Feedback : Mark-Oliver
We don't store the agents in the shared, global memory (grid). In consequence, we have to loop over the agents too many times:
For every agent, the step function has to test whether the target cell is free when updating... rail_env.py: cell_free = not np.any(np.equal(new_position, [agent2.position for agent2 in self.agents]).all(1))
With n = nbr_of_agents, each test scans all agents, so one step costs O(n^2). This could be done in time O(n) instead.
Solution: we store the agents' positions in the global grid. We can then look up an agent by cell, similar to the transition map, which removes many of the repeated lookups.
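A minimal sketch of the difference, not the actual rail_env.py code. The names positions, agent_grid, cell_free_scan and cell_free_lookup are made up for illustration, and agents are reduced to plain (row, col) tuples on a small grid:

```python
import numpy as np

# Hypothetical stand-ins: three agent positions as (row, col) on a 4x5 grid.
positions = [(0, 1), (2, 3), (3, 4)]
height, width = 4, 5


def cell_free_scan(new_position):
    # Same expression as the check quoted from rail_env.py, applied to a
    # plain list of positions: every call compares against all n agents.
    return not np.any(np.equal(new_position, positions).all(1))


# Occupancy grid: holds the agent index per cell, -1 where the cell is free.
agent_grid = np.full((height, width), -1, dtype=int)
for index, (row, col) in enumerate(positions):
    agent_grid[row, col] = index


def cell_free_lookup(new_position):
    # A single array access, independent of the number of agents.
    return bool(agent_grid[tuple(new_position)] == -1)
```

Both functions give the same answer for every cell; only the cost per call differs. The grid has to be kept in sync whenever an agent moves.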
Refactoring:
- introduce a global grid / memory X to store the agents
- write new functions on the env: set agent / remove agent on the grid X. Store just the agent's index; -1 if the cell is free.
- write a new function is free: just a lookup.
- North, South, West, East navigation methods.
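The refactoring steps above could be sketched roughly as follows. This is an illustration under assumptions, not the Flatland API: the class AgentGrid and the names set_agent, remove_agent, is_free and neighbour are hypothetical, and row 0 is assumed to be the northern edge of the grid.

```python
import numpy as np

FREE = -1  # marker for an unoccupied cell, as proposed in the notes

# (d_row, d_col) offsets; assumes row 0 is the northern edge of the grid.
MOVES = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}


class AgentGrid:
    """Hypothetical grid X: one agent index per cell, FREE if the cell is empty."""

    def __init__(self, height, width):
        self.cells = np.full((height, width), FREE, dtype=int)

    def in_bounds(self, position):
        row, col = position
        return 0 <= row < self.cells.shape[0] and 0 <= col < self.cells.shape[1]

    def is_free(self, position):
        # Just a lookup; cells outside the grid are reported as not free.
        return self.in_bounds(position) and bool(self.cells[tuple(position)] == FREE)

    def set_agent(self, agent_index, position):
        if not self.is_free(position):
            raise ValueError(f"cell {position} is occupied or out of bounds")
        self.cells[tuple(position)] = agent_index

    def remove_agent(self, position):
        self.cells[tuple(position)] = FREE

    def neighbour(self, position, direction):
        # Navigation helper: the cell one step to the N, E, S or W.
        d_row, d_col = MOVES[direction]
        return (position[0] + d_row, position[1] + d_col)


grid = AgentGrid(3, 3)
grid.set_agent(0, (1, 1))
target = grid.neighbour((1, 1), "N")  # cell north of the agent
if grid.is_free(target):
    grid.remove_agent((1, 1))
    grid.set_agent(0, target)
```

Storing only the agent index keeps the grid small and lets the env find the agent object through its existing list of agents.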
Environment: if we would like the framework to be used outside the Flatland challenge:
- add new environments, like a 2D maze, ... Multi-agent and search courses for students, where it could be used as a teaching tool