NeurIPS 2020: MineRL Competition Rainbow Baseline with PFRL
This repository is an example Rainbow baseline submission built with PFRL, derived from the main MineRL Competition submission template and starter kit.
For detailed, up-to-date documentation on the competition and the template, see the original template repository.
This repository is a sample of the "Round 1" submission, i.e., the agents are trained locally.
is the entrypoint script for Round 1.
Please ignore train.py, which will be used in Round 2.
directory contains the baseline agent's model weight files, trained on MineRLObtainDiamondDenseVectorObf-v0.
How to Submit
After signing up for the competition, specify your account data in aicrowd.json. See the official doc for detailed information.
Then you can create a submission by making a tag push to your repository on https://gitlab.aicrowd.com/. Any tag push (where the tag name begins with "submission-") to your repository is considered as a submission.
If everything works out correctly, you should be able to see your score on the competition leaderboard.
About Baseline Algorithm
This baseline consists of two main steps:
- Apply K-means clustering to the action vectors in the demonstration dataset to discretize the action space.
- Apply the Rainbow algorithm to the discretized action space.
Each step uses an existing library: K-means in step 1 comes from scikit-learn, and Rainbow in step 2 comes from PFRL, a PyTorch-based RL library.
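As a rough illustration of step 1, the sketch below clusters continuous action vectors with scikit-learn's `KMeans` and treats each centroid as one discrete action. It is not the baseline's actual code: the synthetic `demo_actions` array stands in for actions loaded from the demonstration dataset, and the cluster count, the 4-dimensional action size, and the helper names `to_continuous` and `to_discrete` are assumptions made for this example only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for demonstration actions (assumption): the real baseline would
# load action vectors from the MineRL dataset. Here we synthesize three
# tight, well-separated blobs of 4-dimensional vectors.
rng = np.random.default_rng(0)
blob_means = np.array([
    [-5.0, -5.0, 0.0, 0.0],
    [0.0, 5.0, 5.0, 0.0],
    [5.0, 0.0, -5.0, 5.0],
])
demo_actions = np.concatenate(
    [mean + 0.1 * rng.standard_normal((50, 4)) for mean in blob_means]
)

# Cluster the continuous actions; each centroid becomes one discrete action.
n_actions = 3  # hypothetical value chosen for this toy data
kmeans = KMeans(n_clusters=n_actions, n_init=10, random_state=0).fit(demo_actions)
action_centroids = kmeans.cluster_centers_  # shape: (n_actions, action_dim)


def to_continuous(action_index: int) -> np.ndarray:
    """Map a discrete action index back to its centroid action vector."""
    return action_centroids[action_index]


def to_discrete(action_vector: np.ndarray) -> int:
    """Map a continuous action vector to the index of its nearest centroid."""
    return int(kmeans.predict(action_vector.reshape(1, -1))[0])
```

With a mapping like this, a discrete-action agent such as Rainbow picks an index, and `to_continuous` turns that index into the vector sent to the environment.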
How to Train Baseline Agent on your own
directory contains all you need to train the agent locally:
pip install numpy scipy scikit-learn pandas tqdm joblib pfrl
# Don't forget to set this!
export MINERL_DATA_ROOT=<directory you want to store demonstration dataset>
python3 mod/dqn_family.py \
--gpu 0 --env "MineRLObtainDiamondDenseVectorObf-v0" \
--outdir result \
--noisy-net-sigma 0.5 --arch distributed_dueling --replay-capacity 300000 --replay-start-size 5000 --target-update-interval 10000 \
--num-step-return 10 --agent CategoricalDoubleDQN --monitor --lr 0.0000625 --adam-eps 0.00015 --prioritized --frame-stack 4 --frame-skip 4 \
--gamma 0.99 --batch-accumulator mean
The quick-start kit was authored by Shivam Khandelwal with help from William H. Guss.
The competition is organized by the following team:
- William H. Guss (Carnegie Mellon University)
- Mario Ynocente Castro (Preferred Networks)
- Cayden Codel (Carnegie Mellon University)
- Katja Hofmann (Microsoft Research)
- Brandon Houghton (Carnegie Mellon University)
- Noboru Kuno (Microsoft Research)
- Crissman Loomis (Preferred Networks)
- Keisuke Nakata (Preferred Networks)
- Stephanie Milani (University of Maryland, Baltimore County and Carnegie Mellon University)
- Sharada Mohanty (AIcrowd)
- Diego Perez Liebana (Queen Mary University of London)
- Ruslan Salakhutdinov (Carnegie Mellon University)
- Shinya Shiroshita (Preferred Networks)
- Nicholay Topin (Carnegie Mellon University)
- Avinash Ummadisingu (Preferred Networks)
- Manuela Veloso (Carnegie Mellon University)
- Phillip Wang (Carnegie Mellon University)