bsuite/baselines/utils/sequence.py · master · AIcrowd / research / bsuite

Apr 02, 2020

Fix sequence buffer bug and add some test coverage. · f779cf56

John Aslanides authored Apr 02, 2020

We weren't resetting the buffer state correctly when draining trajectories.

Note that several unrelated bugs initially masked this bug:
- This bug only shows up when `max_sequence_length` is shorter than the episode length, a scenario that is not currently covered by the agent integration tests (they only run against catch: episode len = 10, max_sequence_length=32); I will fix this coverage issue in a follow-up change.
- This bug causes the actor-critic agents to crash on experiments with long episode lengths (e.g. cartpole, mountain_car). These crashes don't show up obviously in high-level benchmarking/analysis (radar plot) because crashed runs (i.e. DNFs) don't count as 'failures', and so don't adversely affect the score; I'll add a separate change to resolve this as well.
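
To illustrate the failure mode, here is a minimal sketch of a fixed-length trajectory buffer whose `drain()` resets its internal state. It is not the bsuite implementation: the class `ToySequenceBuffer`, its private fields, and the helper `run_episode` are hypothetical names chosen for this example. The point is only that when an episode is longer than `max_sequence_length`, the buffer fills mid-episode, so `drain()` must leave it ready to accept the next transition.

```python
import numpy as np


class ToySequenceBuffer:
    """Hypothetical fixed-length trajectory buffer (illustration only)."""

    def __init__(self, obs_shape, max_sequence_length):
        self._max_len = max_sequence_length
        # One extra slot holds the observation that follows the last transition.
        self._observations = np.zeros(
            (max_sequence_length + 1, *obs_shape), dtype=np.float32)
        self._rewards = np.zeros(max_sequence_length, dtype=np.float32)
        self._t = 0
        self._needs_reset = True

    def append(self, obs, reward, next_obs):
        if self.full():
            raise ValueError("Buffer is full; call drain() first.")
        if self._needs_reset:
            # Start a fresh sequence from the current observation.
            self._t = 0
            self._observations[0] = obs
            self._needs_reset = False
        self._rewards[self._t] = reward
        self._observations[self._t + 1] = next_obs
        self._t += 1

    def full(self):
        return self._t == self._max_len

    def drain(self):
        if self._t == 0:
            raise ValueError("Nothing to drain.")
        observations = self._observations[: self._t + 1].copy()
        rewards = self._rewards[: self._t].copy()
        # Reset the state here. Skipping this step would leave the buffer
        # reporting full() forever after the first mid-episode drain.
        self._t = 0
        self._needs_reset = True
        return observations, rewards


def run_episode(episode_length, max_sequence_length):
    """Feeds one synthetic episode through the buffer; returns chunk lengths."""
    buffer = ToySequenceBuffer(
        obs_shape=(1,), max_sequence_length=max_sequence_length)
    chunk_lengths = []
    for step in range(episode_length):
        obs = np.array([step], dtype=np.float32)
        next_obs = np.array([step + 1], dtype=np.float32)
        buffer.append(obs, 1.0, next_obs)
        is_last = step == episode_length - 1
        if buffer.full() or is_last:
            _, rewards = buffer.drain()
            chunk_lengths.append(len(rewards))
    return chunk_lengths
```

With the catch-like setting from the integration tests, `run_episode(10, 32)` drains only once, at the end of the episode, so the reset path after a full buffer is never exercised. A call such as `run_episode(10, 4)` drains mid-episode and does exercise it.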

PiperOrigin-RevId: 304403772
Change-Id: I9dfc2f1b152737e4b10d8afde681e2dadcc85a6f
