*pycache*
*ppo_policy*
torch_training/Nets/
# Default ignored files
/workspace.xml
MIT License
Copyright (c) 2019 SBB AG and AIcrowd
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
include AUTHORS.md
include CONTRIBUTING.rst
include changelog.md
include LICENSE
include README.md
include requirements_torch_training.txt
recursive-include tests *
recursive-exclude * __pycache__
recursive-exclude * *.py[co]
recursive-include docs *.rst *.md conf.py *.jpg *.png *.gif
# ⚠️ Deprecated repository
This repository is deprecated! Please go to:
#### **https://gitlab.aicrowd.com/flatland/flatland-examples**
## Torch Training
The `torch_training` folder shows an example of how to train agents with a DQN implemented in PyTorch.
The links below provide introductions to training an agent on Flatland:
- Training an agent for navigation ([Introduction](https://gitlab.aicrowd.com/flatland/baselines/blob/master/torch_training/Getting_Started_Training.md))
- Training multiple agents to avoid conflicts ([Introduction](https://gitlab.aicrowd.com/flatland/baselines/blob/master/torch_training/Multi_Agent_Training_Intro.md))
Use these introductions to get familiar with the Flatland environment. Then build your own predictors, observations and agents to improve performance further and solve the most complex environments of the challenge.
With the above introductions you will solve tasks like these and even more...
![Conflict_Avoidance](https://i.imgur.com/AvBHKaD.gif)
## Sequential Agent
This is a very simple baseline that shows how the `complex_level_generator` generates feasible network configurations.
If you run the `run_test.py` file you will see a simple agent that solves the level by sequentially running each agent along its shortest path.
This is very inefficient, but it solves all the instances generated by `complex_level_generator`. When scored for the AIcrowd competition, however, this agent fails because of the time it needs to solve an episode.
Here you see it in action:
![Sequential_Agent](https://i.imgur.com/DsbG6zK.gif)
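The idea is simple enough to sketch in a few lines. This is not the actual `run_test.py` code: it assumes `env` is a Flatland `RailEnv`, `get_shortest_path_action` is a hypothetical helper standing in for whatever shortest-path lookup you use, and agents omitted from the action dict are assumed to stay put.

```
# Minimal sketch of the sequential baseline (illustration only).
obs = env.reset()
dones = {'__all__': False}
for handle in env.get_agent_handles():            # run one agent at a time
    while not dones.get(handle, False) and not dones['__all__']:
        # get_shortest_path_action is a hypothetical helper returning the action
        # that moves agent `handle` one cell along its shortest path.
        action_dict = {handle: get_shortest_path_action(env, handle)}
        obs, rewards, dones, _ = env.step(action_dict)
```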
from flatland.envs.rail_env import RailEnv
from ray.rllib.env.multi_agent_env import MultiAgentEnv
from flatland.core.env_observation_builder import TreeObsForRailEnv
from flatland.envs.generators import random_rail_generator
class RailEnvRLLibWrapper(RailEnv, MultiAgentEnv):
    """Flatland RailEnv exposed through RLlib's MultiAgentEnv dict interface."""

    def __init__(self,
                 width,
                 height,
                 rail_generator=random_rail_generator(),
                 number_of_agents=1,
                 obs_builder_object=TreeObsForRailEnv(max_depth=2)):
        super(RailEnvRLLibWrapper, self).__init__(width, height, rail_generator,
                                                  number_of_agents, obs_builder_object)

    def reset(self, regen_rail=True, replace_agents=True):
        self.agents_done = []
        return super(RailEnvRLLibWrapper, self).reset(regen_rail, replace_agents)

    def step(self, action_dict):
        obs, rewards, dones, infos = super(RailEnvRLLibWrapper, self).step(action_dict)

        d = dict()
        r = dict()
        o = dict()

        # Only report entries for agents that were still active before this step;
        # RLlib expects agents that have already finished to disappear from the
        # returned dicts, while the '__all__' done flag is always kept.
        for agent, done in dones.items():
            if agent not in self.agents_done:
                if agent != '__all__':
                    o[agent] = obs[agent]
                    r[agent] = rewards[agent]
                d[agent] = dones[agent]

        # Remember which agents finished during this step.
        for agent, done in dones.items():
            if done and agent != '__all__':
                self.agents_done.append(agent)

        return o, r, d, infos

    def get_agent_handles(self):
        return super(RailEnvRLLibWrapper, self).get_agent_handles()
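Not part of the original file, but a brief usage sketch may help: the wrapper behaves like a Flatland `RailEnv` that speaks RLlib's per-agent dict protocol, so you reset it and then step it with one action per agent handle. The action index `2` below is arbitrary and only for illustration.

```
from flatland.envs.generators import complex_rail_generator

env = RailEnvRLLibWrapper(width=15, height=15,
                          rail_generator=complex_rail_generator(nr_start_goal=3, nr_extra=20, min_dist=12),
                          number_of_agents=3)
obs = env.reset()                                             # dict: agent handle -> observation
actions = {handle: 2 for handle in env.get_agent_handles()}   # same arbitrary action for every agent
obs, rewards, dones, infos = env.step(actions)                # finished agents drop out of the dicts
```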
run_grid_search.name = "n_agents_results"
run_grid_search.num_iterations = 1002
run_grid_search.hidden_sizes = [32, 32]
run_grid_search.map_width = 15
run_grid_search.map_height = 15
run_grid_search.n_agents = {"grid_search": [1, 2, 3, 4]}
run_grid_search.horizon = 50
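These bindings target the `run_grid_search` function in the training script below (it is decorated with `@gin.configurable`), and `{"grid_search": [1, 2, 3, 4]}` appears to be the dict form Ray Tune uses to sweep a parameter. Loading the file is a two-liner; the path below is the one hard-coded in the script:

```
import gin
from ray import tune

gin.external_configurable(tune.grid_search)   # allow tune.grid_search to appear in .gin files
gin.parse_config_file('grid_search_configs/n_agents_grid_search/config.gin')
```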
from baselines.RailEnvRLLibWrapper import RailEnvRLLibWrapper
import random
import gym
from flatland.envs.generators import complex_rail_generator
import ray.rllib.agents.ppo.ppo as ppo
from ray.rllib.agents.ppo.ppo import PPOAgent
from ray.rllib.agents.ppo.ppo_policy_graph import PPOPolicyGraph
from ray.tune.registry import register_env
from ray.rllib.models import ModelCatalog
from ray.tune.logger import pretty_print
from ray.rllib.models.preprocessors import Preprocessor
import ray
import numpy as np
import gin
from ray import tune
class MyPreprocessorClass(Preprocessor):
    def _init_shape(self, obs_space, options):
        return (105,)

    def transform(self, observation):
        return observation  # return the preprocessed observation


ModelCatalog.register_custom_preprocessor("my_prep", MyPreprocessorClass)
ray.init()
def train(config, reporter):
    print('Init Env')
    env_name = f"rail_env_{config['n_agents']}"  # To modify if different environment configs are explored.

    # Example: generate a rail given a manual specification,
    # a map of tuples (cell_type, rotation)
    transition_probability = [0.5,  # empty cell - Case 0
                              1.0,  # Case 1 - straight
                              1.0,  # Case 2 - simple switch
                              0.3,  # Case 3 - diamond crossing
                              0.5,  # Case 4 - single slip
                              0.5,  # Case 5 - double slip
                              0.2,  # Case 6 - symmetrical
                              0.0]  # Case 7 - dead end

    # Example: generate a random rail
    env = RailEnvRLLibWrapper(width=config['map_width'], height=config['map_height'],
                              rail_generator=complex_rail_generator(nr_start_goal=config["n_agents"],
                                                                    nr_extra=20, min_dist=12),
                              number_of_agents=config["n_agents"])
    register_env(env_name, lambda _: env)

    obs_space = gym.spaces.Box(low=-float('inf'), high=float('inf'), shape=(105,))
    act_space = gym.spaces.Discrete(4)

    # Dict with the different policies to train
    policy_graphs = {
        "ppo_policy": (PPOPolicyGraph, obs_space, act_space, {})
    }

    def policy_mapping_fn(agent_id):
        return "ppo_policy"

    agent_config = ppo.DEFAULT_CONFIG.copy()
    agent_config['model'] = {"fcnet_hiddens": config['hidden_sizes'], "custom_preprocessor": "my_prep"}
    agent_config['multiagent'] = {"policy_graphs": policy_graphs,
                                  "policy_mapping_fn": policy_mapping_fn,
                                  "policies_to_train": list(policy_graphs.keys())}
    agent_config["horizon"] = config['horizon']

    ppo_trainer = PPOAgent(env=env_name, config=agent_config)

    for i in range(100000 + 2):
        print("== Iteration", i, "==")
        print("-- PPO --")
        print(pretty_print(ppo_trainer.train()))

        if i % config['save_every'] == 0:
            checkpoint = ppo_trainer.save()
            print("checkpoint saved at", checkpoint)

        reporter(num_iterations_trained=ppo_trainer._iteration)
@gin.configurable
def run_grid_search(name, num_iterations, n_agents, hidden_sizes, save_every,
                    map_width, map_height, horizon, local_dir):
    tune.run(
        train,
        name=name,
        stop={"num_iterations_trained": num_iterations},
        config={"n_agents": n_agents,
                "hidden_sizes": hidden_sizes,  # Array containing the sizes of the network layers
                "save_every": save_every,
                "map_width": map_width,
                "map_height": map_height,
                "local_dir": local_dir,
                "horizon": horizon  # Max number of time steps
                },
        resources_per_trial={
            "cpu": 11,
            "gpu": 0.5
        },
        local_dir=local_dir
    )


if __name__ == '__main__':
    gin.external_configurable(tune.grid_search)
    dir = 'grid_search_configs/n_agents_grid_search'
    gin.parse_config_file(dir + '/config.gin')
    run_grid_search(local_dir=dir)
{'Test_0':[20,20,20,3],
'Test_1':[10,10,3,4321],
'Test_2':[10,10,5,123],
'Test_3':[50,50,5,21],
'Test_4':[50,50,20,85],
'Test_5':[100,100,5,436],
'Test_6':[100,100,20,6487],
'Test_7':[100,100,50,567],
'Test_8':[100,10,20,3245],
'Test_9':[10,100,20,632]
}
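This dict literal appears to be the `parameters.txt` that the scoring script reads with `eval`; judging from the scoring README further down, each entry seems to hold `[x_dim, y_dim, n_agents, seed]`. A minimal read looks like this:

```
# Read the file the same way the scoring script does (eval of a dict literal):
with open('parameters.txt', 'r') as inf:
    parameters = eval(inf.read())

x_dim, y_dim, n_agents, seed = parameters['Test_1']   # 10, 10, 3, 4321
```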
#ray==0.7.0
gym==0.12.5
opencv-python==4.1.0.25
#tensorflow==1.13.1
lz4==2.1.10
gin-config==0.1.4
git+https://gitlab.aicrowd.com/flatland/flatland.git
importlib-metadata>=0.17
importlib_resources>=1.0.2
torch>=1.1.0
import time

import numpy as np

from utils.misc_utils import RandomAgent, run_test

with open('parameters.txt', 'r') as inf:
    parameters = eval(inf.read())  # dict of test name -> [x_dim, y_dim, n_agents, seed]

# Parameter initialization
features_per_node = 9
tree_depth = 3
nodes = 0
for i in range(tree_depth + 1):
    nodes += np.power(4, i)  # number of nodes in a tree with branching factor 4
state_size = features_per_node * nodes * 2
action_size = 5
action_dict = dict()
nr_trials_per_test = 100
test_results = []
test_times = []
test_dones = []

agent = RandomAgent(state_size, action_size)

start_time_scoring = time.time()
test_idx = 0
score_board = []
for test_nr in parameters:
    current_parameters = parameters[test_nr]
    test_score, test_dones, test_time = run_test(current_parameters, agent, test_nr=test_idx)
    print('---------')
    print(' RESULTS')
    print('---------')
    print('{} score was {:.3f} with {:.2f}% environments solved. Test took {} Seconds to complete.\n\n\n'.format(
        test_nr,
        np.mean(test_score), np.mean(test_dones) * 100, test_time))
    test_idx += 1
    score_board.append([test_score, test_dones, test_time])  # record score, dones and per-test time
# Local Submission Scoring
The files in this repo help you score your agent's behavior locally.
**WARNING**: This is not the actual submission scoring; the official results will differ from the scores you achieve here. The scoring setup is, however, very similar to this one.
**Beta Stage**: The scoring function here is still under development; use with caution.
## Introduction
This repo contains a very basic setup for testing your own agent/algorithm against the Flatland scoring procedure.
The four important files are:
- `generate_tests.py` Pre-generates the test files for faster testing
- `score_tests.py` Scores your agent on the generated test files
- `show_test.py` Shows samples of the generated test files
- `parameters.txt` Parameters for generating the test files (these differ from the parameters used in the challenge submission scoring)
To score your agent, follow the steps below.
## Parameters used for Level generation
| Test Nr. | X-Dim | Y-Dim | Nr. Agents | Random Seed |
|:---------:|:------:|:------:|:-----------:|:------------:|
| Test 0 | 10 | 10 | 1 | 3 |
| Test 1 | 10 | 10 | 3 | 3 |
| Test 2 | 10 | 10 | 5 | 3 |
| Test 3 | 50 | 10 | 10 | 3 |
| Test 4 | 20 | 50 | 10 | 3 |
| Test 5 | 20 | 20 | 15 | 3 |
| Test 6 | 50 | 50 | 10 | 3 |
| Test 7 | 50 | 50 | 40 | 3 |
| Test 8 | 100 | 100 | 10 | 3 |
| Test 9 | 100 | 100 | 50 | 3 |
These can be changed if you would like to test your agent's behavior on different configurations.
## Generate the test files
To generate the set of test files, just run `python generate_tests.py`.
This generates pickle files of the levels to test on and places them in the corresponding folders.
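What such pre-generation might look like is sketched below. This is not the actual `generate_tests.py`: the folder layout, the file name, the pickled payload and the NumPy seeding are all assumptions made for illustration.

```
import os
import pickle

import numpy as np
from flatland.envs.rail_env import RailEnv
from flatland.envs.generators import complex_rail_generator

with open('parameters.txt', 'r') as inf:
    parameters = eval(inf.read())          # {'Test_0': [x_dim, y_dim, n_agents, seed], ...}

for test_name, (x_dim, y_dim, n_agents, seed) in parameters.items():
    np.random.seed(seed)                   # assumption: the generator draws from NumPy's global RNG
    env = RailEnv(width=x_dim, height=y_dim,
                  rail_generator=complex_rail_generator(nr_start_goal=n_agents,
                                                        nr_extra=20, min_dist=10),
                  number_of_agents=n_agents)
    env.reset()
    os.makedirs(test_name, exist_ok=True)
    with open(os.path.join(test_name, 'level_0.pkl'), 'wb') as f:
        pickle.dump(env.rail.grid, f)      # hypothetical payload: just the rail grid
```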
## Run Test
To run the tests you have to modify the `score_tests.py` file to load your agent and the necessary predictor and observation builder.
The following lines have to be replaced by your code:
```
# Load your agent
agent = YourAgent
agent.load(Your_Checkpoint)
# Load the necessary Observation Builder and Predictor
predictor = ShortestPathPredictorForRailEnv()
observation_builder = TreeObsForRailEnv(max_depth=tree_depth, predictor=predictor)
```
The agent and the observation builder, as well as an optional observation wrapper, can be passed to the test function like this:
```
test_score, test_dones, test_time = run_test(current_parameters, agent, observation_builder=your_observation_builder,
observation_wrapper=your_observation_wrapper,
test_nr=test_nr, nr_trials_per_test=10)
```
To speed up testing you can limit the number of trials per test (`nr_trials_per_test=10`). After you have made these changes to the file, run `python score_tests.py`, which will produce output similar to this:
```
Running Test_0 with (x_dim,y_dim) = (10,10) and 1 Agents.
Progress: |********************| 100.0% Complete
Test_0 score was -0.380 with 100.00% environments solved. Test took 0.62 Seconds to complete.
Running Test_1 with (x_dim,y_dim) = (10,10) and 3 Agents.
Progress: |********************| 100.0% Complete
Test_1 score was -1.540 with 80.00% environments solved. Test took 2.67 Seconds to complete.
Running Test_2 with (x_dim,y_dim) = (10,10) and 5 Agents.
Progress: |********************| 100.0% Complete
Test_2 score was -2.460 with 80.00% environments solved. Test took 4.48 Seconds to complete.
Running Test_3 with (x_dim,y_dim) = (50,10) and 10 Agents.
Progress: |**__________________| 10.0% Complete
```
The score is computed by
```
score = sum(mean(all_rewards))/max_steps
```
which is the sum over all time steps of the mean reward over all agents. We normalize it by the maximum number of steps allowed for a level of that size. The maximum number of allowed steps is
```
max_steps = mult_factor * (env.height+env.width)
```
where `mult_factor` is a multiplication factor that allows for more time when the difficulty is high.
The number of solved envs is just the percentage of episodes that terminated with all agents done.
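As a rough illustration (not the exact scoring code), both numbers could be computed like this, assuming `all_rewards[t][a]` holds the reward of agent `a` at time step `t` and `episode_dones` flags which trial episodes finished with all agents at their targets:

```
import numpy as np

all_rewards = np.array([[-1.0, -1.0],       # per-step, per-agent rewards (toy values)
                        [-1.0,  0.0],
                        [ 0.0,  0.0]])
episode_dones = [True, True, False, True]   # one flag per trial episode

mult_factor = 1.5                           # assumed value, for illustration only
max_steps = int(mult_factor * (10 + 10))    # env.height + env.width for a 10x10 level

score = np.sum(np.mean(all_rewards, axis=1)) / max_steps   # sum over steps of the per-step mean
solved = np.mean(episode_dones) * 100                      # percentage of solved episodes
```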
How these two numbers are used to define your final score will be posted on the [flatland page](https://www.aicrowd.com/organizers/sbb/challenges/flatland-challenge)
## Ignore everything in this directory
*
# Except this file
!.gitignore