Local Submission Scoring

The files in this repo are supposed to help you score your agent's behavior locally.

WARNING: This is not the actual submission scoring --> results will differ from the scores you achieve here. However, the setup used for the submission scoring is very similar to this one.

Beta Stage: The scoring function here is still under development; use it with caution.

Introduction

This repo contains a very basic setup to test your own agent/algorithm on the Flatland scoring setup. The repo contains four important files:

  • generate_tests.py: pre-generates the test files for faster testing
  • score_tests.py: scores your agent on the generated test files
  • show_test.py: shows samples of the generated test files
  • parameters.txt: parameters for generating the test files --> these differ from the parameters used in the challenge submission scoring

To start the scoring of your agent, you need to do the following:

Parameters used for Level generation

Test Nr.   X-Dim   Y-Dim   Nr. Agents   Random Seed
Test 0       10      10        1             3
Test 1       10      10        3             3
Test 2       10      10        5             3
Test 3       50      10       10             3
Test 4       20      50       10             3
Test 5       20      20       15             3
Test 6       50      50       10             3
Test 7       50      50       40             3
Test 8      100     100       10             3
Test 9      100     100       50             3

These parameters can be changed if you would like to test your agent's behavior on different test configurations.
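
If you want to iterate over these tests in your own scripts, the table above translates directly into a plain Python structure. This is purely illustrative; the variable name below is made up and the actual parameters.txt used by generate_tests.py may store these values in a different format.

# Illustrative only: the parameter table above as a Python list of tuples
# (test name, x_dim, y_dim, n_agents, seed). The real parameters.txt used
# by generate_tests.py may use a different format.
TEST_PARAMETERS = [
    ("Test_0",  10,  10,  1, 3),
    ("Test_1",  10,  10,  3, 3),
    ("Test_2",  10,  10,  5, 3),
    ("Test_3",  50,  10, 10, 3),
    ("Test_4",  20,  50, 10, 3),
    ("Test_5",  20,  20, 15, 3),
    ("Test_6",  50,  50, 10, 3),
    ("Test_7",  50,  50, 40, 3),
    ("Test_8", 100, 100, 10, 3),
    ("Test_9", 100, 100, 50, 3),
]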

Generate the test files

To generate the set of test files, just run python generate_tests.py. This generates pickle files of the levels to test on and places them in the corresponding folders.
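
As a rough idea of what happens for each parameter row: a rail environment is built from the level size, agent count and seed, and then serialized to disk. The sketch below is an assumption based on the flatland-rl 2.x API (complex_rail_generator, RailEnv); the helper name, the generator settings and the pickled content are illustrative and may differ from what generate_tests.py actually does.

import pickle

from flatland.envs.rail_env import RailEnv
from flatland.envs.rail_generators import complex_rail_generator

# Illustrative sketch of one generation step: build a level from a parameter
# row (x_dim, y_dim, n_agents, seed) and pickle the rail grid. The real
# generate_tests.py may serialize more information (agents, targets, ...).
def generate_level(x_dim, y_dim, n_agents, seed, file_name):
    rail_generator = complex_rail_generator(nr_start_goal=n_agents + 5,
                                            nr_extra=5,
                                            min_dist=5,
                                            seed=seed)
    env = RailEnv(width=x_dim, height=y_dim,
                  rail_generator=rail_generator,
                  number_of_agents=n_agents)
    env.reset()
    with open(file_name, "wb") as f:
        pickle.dump(env.rail.grid, f)

# Placeholder path; the real folder layout is created by generate_tests.py.
generate_level(10, 10, 1, 3, "Test_0/Level_0.pkl")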

Run Test

To run the tests you have to modify the score_tests.py file to load your agent and the necessary predictor and observation builder. The following lines have to be replaced by your code:

# Load your agent
agent = YourAgent
agent.load(Your_Checkpoint)

# Load the necessary Observation Builder and Predictor
predictor = ShortestPathPredictorForRailEnv()
observation_builder = TreeObsForRailEnv(max_depth=tree_depth, predictor=predictor)
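
What exactly YourAgent has to provide depends on how score_tests.py calls it; in the snippet above it only needs a load method for the checkpoint and, inside the test loop, a way to map observations to actions. A minimal stub, purely illustrative (the class name, the act signature and the checkpoint path are assumptions, not the required interface), could look like this:

import numpy as np

# Purely illustrative agent stub -- adjust it to whatever interface
# score_tests.py expects from your agent.
class RandomAgent:
    def __init__(self, action_size=5):
        # Flatland rail environments use 5 discrete actions per agent.
        self.action_size = action_size

    def load(self, checkpoint_path):
        # A trained agent would restore its weights here;
        # a random agent has nothing to load.
        pass

    def act(self, observation):
        return np.random.randint(self.action_size)

agent = RandomAgent()
agent.load("./checkpoints/your_checkpoint.pth")  # placeholder path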

The agent and the observation builder, as well as an optional observation wrapper, can be passed to the test function like this:

test_score, test_dones, test_time = run_test(current_parameters, agent, observation_builder=your_observation_builder,
                                             observation_wrapper=your_observation_wrapper,
                                             test_nr=test_nr, nr_trials_per_test=10)
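
The observation wrapper is applied to each observation before it is handed to the agent, which is useful if your agent expects, for example, a flat normalized vector instead of the raw tree observation. Below is a minimal sketch under the assumption that the wrapper is simply a callable taking one observation and returning the transformed one; check score_tests.py for how the wrapper is actually invoked.

import numpy as np

# Illustrative wrapper: clip and rescale an already-flattened observation.
class SimpleObsWrapper:
    def __init__(self, clip_min=-1.0, clip_max=1000.0):
        self.clip_min = clip_min
        self.clip_max = clip_max

    def __call__(self, observation):
        obs = np.asarray(observation, dtype=np.float32)
        obs = np.clip(obs, self.clip_min, self.clip_max)
        # Rescale to [0, 1] so all features share a comparable range.
        return (obs - self.clip_min) / (self.clip_max - self.clip_min)

your_observation_wrapper = SimpleObsWrapper()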

In order to speed up the test time you can limit the number of trials per test (nr_trials_per_test=10). After you have made these changes to the file, you can run python score_tests.py, which will produce an output similar to this:

Running Test_0 with (x_dim,y_dim) = (10,10) and 1 Agents.
Progress: |********************| 100.0% Complete 
Test_0 score was -0.380 with 100.00% environments solved. Test took 0.62 Seconds to complete.

Running Test_1 with (x_dim,y_dim) = (10,10) and 3 Agents.
Progress: |********************| 100.0% Complete 
Test_1 score was -1.540 with 80.00% environments solved. Test took 2.67 Seconds to complete.

Running Test_2 with (x_dim,y_dim) = (10,10) and 5 Agents.
Progress: |********************| 100.0% Complete 
Test_2 score was -2.460 with 80.00% environments solved. Test took 4.48 Seconds to complete.

Running Test_3 with (x_dim,y_dim) = (50,10) and 10 Agents.
Progress: |**__________________| 10.0% Complete

The score is computed by

score = sum(mean(all_rewards))/max_steps

which is the mean over all agents of the reward at each time step, summed over all time steps. We normalize this sum by the maximum number of steps allowed for a level of that size. The maximum number of allowed steps is

max_steps = mult_factor * (env.height+env.width)

where mult_factor is a multiplication factor that allows for more time when the difficulty is high.

The number of solved envs is just the percentage of episodes that terminated with all agents done.
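
In code, the two reported numbers can be reproduced roughly as follows. This is only a sketch: the variable names (rewards_per_step, dones_per_trial) and the exact bookkeeping are assumptions, and the concrete value of mult_factor is set by the scoring script.

import numpy as np

# Sketch of the per-trial score: rewards_per_step is assumed to be a list
# with one entry per time step, each holding the rewards of all agents.
def trial_score(rewards_per_step, env_height, env_width, mult_factor):
    max_steps = mult_factor * (env_height + env_width)
    # Mean over agents per step, summed over steps, normalized by max_steps.
    return np.sum([np.mean(step) for step in rewards_per_step]) / max_steps

# Sketch of the solved percentage: dones_per_trial is assumed to hold one
# list of per-agent "done" flags for each episode.
def solved_percentage(dones_per_trial):
    return 100.0 * np.mean([all(done) for done in dones_per_trial])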

How these two numbers are combined into your final score will be posted on the Flatland page.