Flatland / neurips2020-flatland-baselines

Commit 684fefd3, authored May 04, 2020 by MasterScrat
Fixes and hacks to allow the use of a single policy with multiple agents
parent 7d742c53

Showing 3 changed files with 13 additions and 5 deletions (+13, -5)
- envs/flatland/utils/rllib_wrapper.py  (+3, -1)
- envs/flatland_single.py  (+8, -3)
- train.py  (+2, -1)
envs/flatland/utils/rllib_wrapper.py

```diff
@@ -49,7 +49,9 @@ class FlatlandRllibWrapper(object):
         for agent, done in dones.items():
             if agent != '__all__' and not agent in obs:
                 continue  # skip agent if there is no observation
-            if agent not in self._agents_done:
+            # FIXME the check below should be kept in MARL training
+            #if agent not in self._agents_done:
+            if True or agent not in self._agents_done:
                 if agent != '__all__':
                     if done:
                         self._agents_done.append(agent)
```
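The `if True or ...` hack above disables the membership check, so any done agent with an observation is appended to `_agents_done` every time it appears in `dones`. A minimal standalone sketch of that loop; the `collect_done_agents` helper and the `dones`/`obs` inputs are hypothetical, not taken from the repo:

```python
def collect_done_agents(dones, obs, agents_done):
    """Mimic the patched loop: append every observed, done agent,
    even if it was already recorded in agents_done."""
    for agent, done in dones.items():
        if agent != '__all__' and agent not in obs:
            continue  # skip agent if there is no observation
        if True or agent not in agents_done:  # hack: branch always taken
            if agent != '__all__' and done:
                agents_done.append(agent)
    return agents_done

dones = {0: True, 1: False, '__all__': False}
obs = {0: [0.1], 1: [0.2]}
print(collect_done_agents(dones, obs, []))  # [0]
```

Note the side effect the FIXME warns about: calling this again with `agents_done=[0]` appends agent 0 a second time, which is why the original check should be restored for MARL training.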
envs/flatland_single.py

```diff
@@ -73,10 +73,15 @@ class FlatlandSingle(gym.Env):
         return env

-    def step(self, action_dict):
+    def step(self, action_list):
         # print("="*50)
         # print(action_dict)
-        step_r = self._env.step({0: action_dict})
+        action_dict = {}
+        for i, action in enumerate(action_list):
+            action_dict[i] = action
+
+        step_r = self._env.step(action_dict)
         # print(step_r)
         # print("="*50)
```
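The conversion in the new `step()` maps list index to agent handle, so agent `i` receives `action_list[i]`; it is equivalent to `dict(enumerate(...))`. A quick sketch with hypothetical action values:

```python
action_list = [2, 0, 3]  # hypothetical actions, one per agent

# Explicit loop, as in the patched step():
action_dict = {}
for i, action in enumerate(action_list):
    action_dict[i] = action

# One-line equivalent:
assert action_dict == dict(enumerate(action_list))
print(action_dict)  # {0: 2, 1: 0, 2: 3}
```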
```diff
@@ -95,7 +100,7 @@ class FlatlandSingle(gym.Env):
         # print(foo)
         # print("="*50)
-        return [step for step in foo.obs.values()],
+        return [step for step in foo.values()]
         #return foo

     @property
```
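The corrected return line flattens the per-agent dict into a plain list. A sketch with a hypothetical observation dict keyed by agent index:

```python
# Hypothetical per-agent observations, not taken from a real run.
foo = {0: [1.0, 0.0], 1: [0.5, 0.5]}

# The patched return expression:
flat = [step for step in foo.values()]

assert flat == [[1.0, 0.0], [0.5, 0.5]]
```

This relies on dict insertion order (guaranteed since Python 3.7); if keys might be inserted out of order, iterating over `sorted(foo)` would be safer.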
train.py

```diff
@@ -45,7 +45,8 @@ def on_episode_end(info):
         if agent_info["agent_done"]:
             episode_done_agents += 1

-    assert len(episode._agent_to_last_info) == episode_num_agents
+    # Not a valid check when considering a single policy for multiple agents
+    #assert len(episode._agent_to_last_info) == episode_num_agents

     norm_factor = 1.0 / (episode_max_steps + episode_num_agents)
     percentage_complete = float(episode_done_agents) / episode_num_agents
```
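The metric kept by this callback is the fraction of agents that finished the episode. A sketch with hypothetical episode stats:

```python
# Hypothetical episode statistics, not from a real run.
episode_done_agents = 3
episode_num_agents = 4
episode_max_steps = 100

norm_factor = 1.0 / (episode_max_steps + episode_num_agents)
percentage_complete = float(episode_done_agents) / episode_num_agents

print(percentage_complete)  # 0.75
```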