Flatland / Issues / #297

Closed
Open
Created Nov 12, 2019 by Erik Nygren (@mlerik), Owner

Done __all__ rewards wrong

If the environment terminates because the maximum number of time steps is reached and we keep calling env.step(), the global reward is returned to all agents as if they had finished.

This behavior is not intended. We should remove this behavior and update the code accordingly. Refactoring how we terminate the environment when the maximum number of time steps is reached would be the best approach.
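
To make the intended behavior concrete, here is a minimal sketch using a toy environment. It is not Flatland's actual code: `ToyEnv`, its constructor parameters, and the `arrivals` argument are hypothetical names chosen for illustration, and the reward values are assumptions. The sketch separates "all agents arrived" from "time limit reached" and makes any `step()` call after termination return zero rewards instead of the global reward.

```python
from typing import Dict, Tuple


class ToyEnv:
    """Hypothetical stand-in for a multi-agent environment (not Flatland's RailEnv)."""

    def __init__(self, n_agents: int, max_steps: int,
                 global_reward: float = 1.0, step_penalty: float = -1.0):
        self.n_agents = n_agents
        self.max_steps = max_steps
        self.global_reward = global_reward  # assumed value, paid only on real success
        self.step_penalty = step_penalty    # assumed per-step cost for unfinished agents
        self.elapsed = 0
        self.arrived = {a: False for a in range(n_agents)}
        self.dones: Dict[object, bool] = {a: False for a in range(n_agents)}
        self.dones["__all__"] = False

    def step(self, arrivals: Dict[int, bool]) -> Tuple[Dict[int, float], Dict[object, bool]]:
        # Once the episode is over, further calls hand out nothing.
        if self.dones["__all__"]:
            return {a: 0.0 for a in range(self.n_agents)}, dict(self.dones)

        self.elapsed += 1
        for agent, has_arrived in arrivals.items():
            if has_arrived:
                self.arrived[agent] = True
        for agent in range(self.n_agents):
            self.dones[agent] = self.arrived[agent]

        if all(self.arrived.values()):
            # Genuine success: every agent gets the global reward.
            rewards = {a: self.global_reward for a in range(self.n_agents)}
            self.dones["__all__"] = True
        else:
            # Normal step or timeout: unfinished agents only pay the step penalty.
            rewards = {a: 0.0 if self.arrived[a] else self.step_penalty
                       for a in range(self.n_agents)}
            if self.elapsed >= self.max_steps:
                # Time limit reached: end the episode without the global reward.
                self.dones["__all__"] = True
        return rewards, dict(self.dones)


if __name__ == "__main__":
    env = ToyEnv(n_agents=2, max_steps=3)
    for _ in range(4):  # one call more than max_steps
        rewards, dones = env.step({})
        print(rewards, dones["__all__"])
```

In this sketch the timeout branch sets only `dones["__all__"]`, so per-agent `done` flags still tell us who actually arrived. Whether the real environment should do the same is a design decision for the refactoring.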
