Commit 4c4b0b3a authored by Dipam Chakraborty

folder structure updated

parent 40051264
![Nethack Banner](https://raw.githubusercontent.com/facebookresearch/nle/master/dat/nle/logo.png)
# **NeurIPS 2021 - The NetHack Challenge** - Getting started

* **Challenge page** - https://www.aicrowd.com/challenges/neurips-2021-nethack-challenge
* **IMPORTANT - [Accept the rules before you submit](https://www.aicrowd.com/challenges/neurips-2021-nethack-challenge/challenge_rules)**
* **Join the Discord server** - https://discord.gg/zkFWQmSWBA
* Clone the starter kit to start competing - TODO Add final starter kit link

This repository is the NetHack Challenge **Submission template and Starter kit**! It contains:
* **Documentation** on how to submit your models to the leaderboard
* **The procedure** for submitting, along with best practices and details on how we evaluate your agent
* **Starter code** for you to get started!
* **Baselines** for you to get started with training easily
# Table of Contents
1. [Competition Procedure](#competition-procedure)
2. [How to access and use dataset](#how-to-access-and-use-dataset)
3. [How to start participating](#how-to-start-participating)
4. [How do I specify my software runtime / dependencies?](#how-do-i-specify-my-software-runtime-dependencies-)
5. [What should my code structure be like ?](#what-should-my-code-structure-be-like-)
6. [How to make submission](#how-to-make-submission)
7. [Other concepts](#other-concepts)
8. [Important links](#-important-links)
<p style="text-align:center"><img style="text-align:center" src="https://raw.githubusercontent.com/facebookresearch/nle/master/dat/nle/example_run.gif"></p>
# Competition Procedure
The NetHack Learning Environment (NLE) is a Reinforcement Learning environment presented at NeurIPS 2020. NLE is based on NetHack 3.6.6, is designed to provide a standard RL interface to the game, and comes with tasks that serve as a first step for evaluating agents in this new environment. You can read more about NLE in the NeurIPS 2020 paper.
We are excited that this competition offers machine learning students, researchers and NetHack-bot builders the opportunity to participate in a grand challenge in AI without prohibitive computational costs—and we are eagerly looking forward to the wide variety of submissions.
**The following is a high-level description of how this process works:**
![](https://i.imgur.com/xzQkwKV.jpg)
1. **Sign up** to join the competition [on the AIcrowd website](https://www.aicrowd.com/challenges/neurips-2021-nethack-challenge).
2. **Clone** this repo and start developing your solution.
3. **Train** your models on NLE and write your rollout code in `rollout.py` (a simplified sketch of the agent interface it expects follows this list).
4. [**Submit**](#how-to-submit-a-model) your trained models to [AIcrowd Gitlab](https://gitlab.aicrowd.com) for evaluation [(full instructions below)](#how-to-submit-a-model). The automated evaluation setup will evaluate the submissions against the NLE environment for a fixed number of rollouts to compute and report the metrics on the leaderboard of the competition.
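For orientation, the rollout code in this kit is built around a batched agent interface. The sketch below is a simplified outline of the `BatchedAgent` template that ships with the starter kit (the class name `MyBatchedAgent` and the random policy are placeholders, not a working submission):

```python
import numpy as np


class MyBatchedAgent:
    """Minimal outline of the agent interface the batched rollout expects."""

    def __init__(self, num_envs, num_actions):
        # Set up your model and load your weights here.
        self.num_envs = num_envs
        self.num_actions = num_actions
        self.rng = np.random.RandomState(42)

    def batched_step(self, observations, rewards, dones, infos):
        # Return one action per environment; random actions stand in for a policy.
        return self.rng.randint(self.num_actions, size=self.num_envs)
```

The full template, including observation preprocessing hooks, lives in `nethack_baselines/utils/batched_agent.py`.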
# Installation - NetHack Learning Environment

Install the environment from the [original NLE repository](https://github.com/facebookresearch/nle).
NLE requires `python>=3.5` and `cmake>=3.14` to be installed and available both when building the package and at runtime.
On **MacOS**, one can use `Homebrew` as follows:
``` bash
$ brew install cmake
```
On a plain **Ubuntu 18.04** distribution, `cmake` and other dependencies
can be installed by doing:
```bash
# Python and most build deps
$ sudo apt-get install -y build-essential autoconf libtool pkg-config \
python3-dev python3-pip python3-numpy git flex bison libbz2-dev
# recent cmake version
$ wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | sudo apt-key add -
$ sudo apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main'
$ sudo apt-get update && sudo apt-get --allow-unauthenticated install -y \
cmake \
kitware-archive-keyring
```
Afterwards it's a matter of setting up your environment. We advise using a conda
environment for this:
```bash
$ conda create -n nle python=3.8
$ conda activate nle
$ pip install git+https://github.com/facebookresearch/nle.git@eric/competition --no-binary=nle
```
Find more details in the [original NLE repository](https://github.com/facebookresearch/nle).
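To verify the installation, a quick smoke test along these lines should run without errors (a minimal sketch; it assumes NLE has registered the `NetHackChallenge-v0` environment that this kit uses):

```python
import gym
import nle  # noqa: F401  -- importing nle registers the NetHack environments with gym

env = gym.make("NetHackChallenge-v0")
obs = env.reset()
for _ in range(10):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
env.close()
```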
# How to start participating
You can add your SSH Keys to your GitLab account by going to your profile settings [here](https://gitlab.aicrowd.com/profile/keys). If you do not have SSH Keys, you will first need to [generate one](https://docs.gitlab.com/ee/ssh/README.html#generating-a-new-ssh-key-pair).
2. **Clone the repository** - TODO
```
git clone git@github.com:AIcrowd/neurips-2021-nethack-starter-kit.git
```
3. **Install** competition-specific dependencies!
```
cd neurips-2021-nethack-starter-kit
pip install -r requirements.txt
pip install aicrowd-api
pip install aicrowd-gym
## Install NLE according to the instructions above
```
4. Try out the random rollout script in `rollout.py` (see the configuration sketch just below for pointing the kit at your own agent afterwards).
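Once the random rollout runs, point the kit at your own agent by editing `submission_config.py`. A minimal sketch is shown below (the `my_agent` module and `MyAgent` class are hypothetical placeholders; the field names, including the `Submision_Agent` spelling, are the ones this kit actually reads):

```python
# submission_config.py (sketch)
from my_agent import MyAgent  # hypothetical module containing your BatchedAgent subclass
from wrappers import addtimelimitwrapper_fn


class SubmissionConfig:
    Submision_Agent = MyAgent                        # your agent class (field name spelled as in the kit)
    NUM_PARALLEL_ENVIRONMENTS = 16                   # reduce if it does not fit on your GPU
    submission_env_make_fn = addtimelimitwrapper_fn  # provide the function itself, don't call it
```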
## How do I specify my software runtime / dependencies ?
We accept submissions with custom runtimes, so you don't need to worry about which libraries or frameworks to pick.

The configuration files typically include `requirements.txt` (pypi packages), `apt.txt` (apt packages), optionally an `environment.yml` (conda environment), or even your own `Dockerfile`.

You can find detailed information about this in 👉 [RUNTIME.md](/docs/RUNTIME.md).
## What should my code structure be like ?
The different files and directories have the following meaning:

```
├── aicrowd.json # Submission meta information - like your username
├── apt.txt # Packages to be installed inside docker image
├── requirements.txt # Python packages to be installed
├── rollout.py # Your rollout code - can use a batched agent
├── run.sh # Submission entrypoint
└── utility # The utility scripts to provide smoother experience to you.
├── docker_build.sh
```

The submission entrypoint is a bash script `run.sh`, from which you can call any arbitrary code. For detailed instructions on how to submit, see:
👉 [SUBMISSION.md](/docs/SUBMISSION.md)
**Best of Luck** 🎉 🎉
# Other Information
To be added.
## Contributing

To be added.
## Contributors
- [Shivam Khandelwal](https://www.aicrowd.com/participants/shivam)
- [Jyotish Poonganam](https://www.aicrowd.com/participants/jyotish)
- [Dipam Chakraborty](https://www.aicrowd.com/participants/dipam)
# 📎 Important links
💪 &nbsp;Challenge Page: https://www.aicrowd.com/challenges/neurips-2021-nethack-challenge
🗣️ &nbsp;Discussion Forum: https://www.aicrowd.com/challenges/neurips-2021-nethack-challenge/discussion
🏆 &nbsp;Leaderboard: https://www.aicrowd.com/challenges/neurips-2021-nethack-challenge/leaderboards
**Best of Luck** 🎉 🎉
import aicrowd_gym
import nle
import numpy as np
from tqdm import trange
from custom_wrappers import EarlyTerminationNethack
from batched_env import BactchedEnv
class BatchedAgent:
"""
Simple Batched agent interface
The main motivation is to speed up runs by increasing GPU utilization
"""
def __init__(self, num_envs):
"""
Setup your model
Load your weights etc
"""
self.num_envs = num_envs
def preprocess_observations(self, observations, rewards, dones, infos):
"""
Add any preprocessing steps, for example reordering/stacking for torch/tf in your model
"""
pass
def batched_step(self):
"""
Return a list of actions
"""
pass
class RandomBatchedAgent(BatchedAgent):
def __init__(self, num_envs, num_actions):
super().__init__(num_envs)
self.num_actions = num_actions
self.seeded_state = np.random.RandomState(42)
def preprocess_observations(self, observations, rewards, dones, infos):
return observations, rewards, dones, infos
def batched_step(self, observations, rewards, dones, infos):
rets = self.preprocess_observations(observations, rewards, dones, infos)
observations, rewards, dones, infos = rets
actions = self.seeded_state.randint(self.num_actions, size=self.num_envs)
return actions
if __name__ == '__main__':
def nethack_make_fn():
# These settings will be fixed by the aicrowd evaluator
env = aicrowd_gym.make('NetHackChallenge-v0',
observation_keys=("glyphs",
"chars",
"colors",
"specials",
"blstats",
"message",
"tty_chars",
"tty_colors",
"tty_cursor",))
# This wrapper will always be added on the aicrowd evaluator
env = EarlyTerminationNethack(env=env,
minimum_score=1000,
cutoff_timesteps=50000)
# Add any wrappers you need
return env
# Change num_envs as you need; for example, reduce it if it doesn't fit on your GPU
# but increasing above 32 is not advisable for the Nethack Challenge 2021
num_envs = 16
batched_env = BactchedEnv(env_make_fn=nethack_make_fn, num_envs=num_envs)
# This part can be left as is
observations = batched_env.batch_reset()
rewards = [0.0 for _ in range(num_envs)]
dones = [False for _ in range(num_envs)]
infos = [{} for _ in range(num_envs)]
# Change this to your agent interface
num_actions = batched_env.envs[0].action_space.n
agent = RandomBatchedAgent(num_envs, num_actions)
# The evaluation setup will automatically stop after the requisite number of rollouts
# But you can change this if you want
for _ in trange(1000000000000):
# Ideally this part can be left unchanged
actions = agent.batched_step(observations, rewards, dones, infos)
observations, rewards, dones, infos = batched_env.batch_step(actions)
for done_idx in np.where(dones)[0]:
observations[done_idx] = batched_env.single_env_reset(done_idx)
## This file is intended to emulate the evaluation on AIcrowd
# IMPORTANT - Differences to expect
# * Not all of the environment's functions are available
# * The run might be slower than your local run
# * Resources might vary from your local machine
from submission_agent import SubmissionConfig, LocalEvaluationConfig
from rollout import run_batched_rollout
from nethack_baselines.utils.batched_env import BactchedEnv
# Ideally you shouldn't need to change anything below
def add_evaluation_wrappers_fn(env_make_fn):
max_episodes = LocalEvaluationConfig.LOCAL_EVALUATION_NUM_EPISODES
# TODO: use LOCAL_EVALUATION_NUM_EPISODES for limiting episodes
return env_make_fn
def evaluate():
submission_env_make_fn = SubmissionConfig.submission_env_make_fn
num_envs = SubmissionConfig.NUM_PARALLEL_ENVIRONMENTS
Agent = SubmissionConfig.Submision_Agent
evaluation_env_fn = add_evaluation_wrappers_fn(submission_env_make_fn)
batched_env = BactchedEnv(env_make_fn=evaluation_env_fn,
num_envs=num_envs)
num_envs = batched_env.num_envs
num_actions = batched_env.num_actions
agent = Agent(num_envs, num_actions)
run_batched_rollout(batched_env, agent)
if __name__ == '__main__':
evaluate()
# This is intended as an example of a barebones submission
# Do note that not using BatchedEnv may not meet the timeout requirement.
import aicrowd_gym
import nle
def main():
"""
This function will be called for training phase.
"""
# This allows us to limit the features of the environment
# that we don't want participants to use during the submission
env = aicrowd_gym.make("NetHackChallenge-v0")
env.reset()
done = False
episode_count = 0
while episode_count < 200:
_, _, done, _ = env.step(env.action_space.sample())
if done:
episode_count += 1
print(episode_count)
env.reset()
if __name__ == "__main__":
main()
import numpy as np
from nethack_baselines.utils.batched_agent import BatchedAgent
class RandomAgent(BatchedAgent):
def __init__(self, num_envs, num_actions):
super().__init__(num_envs, num_actions)
self.seeded_state = np.random.RandomState(42)
def preprocess_observations(self, observations, rewards, dones, infos):
return observations, rewards, dones, infos
def postprocess_actions(self, actions):
return actions
def batched_step(self, observations, rewards, dones, infos):
rets = self.preprocess_observations(observations, rewards, dones, infos)
observations, rewards, dones, infos = rets
actions = self.seeded_state.randint(self.num_actions, size=self.num_envs)
actions = self.postprocess_actions(actions)
return actions
class BatchedAgent:
"""
Simple Batched agent interface
The main motivation is to speed up runs by increasing GPU utilization
"""
def __init__(self, num_envs, num_actions):
"""
Setup your model
Load your weights etc
"""
self.num_envs = num_envs
self.num_actions = num_actions
def preprocess_observations(self, observations, rewards, dones, infos):
"""
Add any preprocessing steps, for example reordering/stacking for torch/tf in your model
"""
pass
def postprocess_actions(self, actions):
"""
Add any postprocessing steps, for example converting to lists
"""
pass
def batched_step(self):
"""
Return a list of actions
"""
pass
import gym
import nle
import numpy as np
from tqdm import trange
from collections.abc import Iterable
"""
self.num_envs = num_envs
self.envs = [env_make_fn() for _ in range(self.num_envs)]
self.num_actions = self.envs[0].action_space.n
# TODO: Can have different settings for each env? Probably not needed for Nethack
def batch_step(self, actions):
"tty_colors",
"tty_cursor",))
num_envs = 16
num_envs = 4
batched_env = BactchedEnv(env_make_fn=nethack_make_fn, num_envs=num_envs)
observations = batched_env.batch_reset()
num_actions = batched_env.envs[0].action_space.n
import nle
# For your local evaluation, aicrowd_gym is completely identical to gym
import aicrowd_gym
def nethack_make_fn():
# These settings will be fixed by the AIcrowd evaluator
# This allows us to limit the features of the environment
# that we don't want participants to use during the submission
return aicrowd_gym.make('NetHackChallenge-v0',
observation_keys=("glyphs",
"chars",
"colors",
"specials",
"blstats",
"message",
"tty_chars",
"tty_colors",
"tty_cursor",))
#!/usr/bin/env python
# This file is the entrypoint for your submission
# You can modify this file to include your code or directly call your functions/modules from here.
import aicrowd_gym
import nle
############################################################
## Ideally you shouldn't need to change this file at all ##
############################################################
def main():
import numpy as np
from nethack_baselines.utils.batched_env import BactchedEnv
from submission_agent import SubmissionConfig
def run_batched_rollout(batched_env, agent):
"""
This function runs the rollout on the batched environment.
"""
# This allows us to limit the features of the environment
# that we don't want participants to use during the submission
env = aicrowd_gym.make("NetHackChallenge-v0")
num_envs = batched_env.num_envs
# This part can be left as is
observations = batched_env.batch_reset()
rewards = [0.0 for _ in range(num_envs)]
dones = [False for _ in range(num_envs)]
infos = [{} for _ in range(num_envs)]
env.reset()
done = False
episode_count = 0
while episode_count < 20:
_, _, done, _ = env.step(env.action_space.sample())
if done:
# The evaluator will automatically stop after the episodes based on the development/test phase
while episode_count < 10000:
actions = agent.batched_step(observations, rewards, dones, infos)
observations, rewards, dones, infos = batched_env.batch_step(actions)
for done_idx in np.where(dones)[0]:
observations[done_idx] = batched_env.single_env_reset(done_idx)
episode_count += 1
print(episode_count)
env.reset()
print("Episodes Completed :", episode_count)
if __name__ == "__main__":
main()
submission_env_make_fn = SubmissionConfig.submission_env_make_fn
NUM_PARALLEL_ENVIRONMENTS = SubmissionConfig.NUM_PARALLEL_ENVIRONMENTS
Agent = SubmissionConfig.Submision_Agent
batched_env = BactchedEnv(env_make_fn=submission_env_make_fn,
num_envs=NUM_PARALLEL_ENVIRONMENTS)
num_envs = batched_env.num_envs
num_actions = batched_env.num_actions
agent = Agent(num_envs, num_actions)
run_batched_rollout(batched_env, agent)
#!/bin/bash
python agent.py
python rollout.py
from nethack_baselines.random_submission_agent import RandomAgent
# from nethack_baselines.torchbeast_submission_agent import TorchBeastAgent
# from nethack_baselines.rllib_submission_agent import RLlibAgent
from wrappers import addtimelimitwrapper_fn
################################################
# Import your own agent code #
# Set Submision_Agent to your agent #
# Set NUM_PARALLEL_ENVIRONMENTS as needed #
# Set submission_env_make_fn to your wrappers #
# Test with local_evaluation.py #
################################################
class SubmissionConfig:
## Add your own agent class
# Submision_Agent = TorchBeastAgent
# Submision_Agent = RLlibAgent
Submision_Agent = RandomAgent
## Change the NUM_PARALLEL_ENVIRONMENTS as you need
## for example, reduce it if it doesn't fit on your GPU
## Increasing above 32 is not advisable for the Nethack Challenge 2021
NUM_PARALLEL_ENVIRONMENTS = 16
## Add a function that creates your nethack env
## Mainly this is to add wrappers
## Add your wrappers to wrappers.py and change the name here
## IMPORTANT: Don't "call" the function, only provide the name
submission_env_make_fn = addtimelimitwrapper_fn
class LocalEvaluationConfig:
# Change this to locally check a different number of rollouts
# The AIcrowd submission evaluator will not use this
# It is only for your local evaluation
LOCAL_EVALUATION_NUM_EPISODES = 50
from gym.wrappers import TimeLimit
from nethack_baselines.utils.nethack_env_creation import nethack_make_fn
def addtimelimitwrapper_fn():
"""
An example of how to add wrappers to the nethack_make_fn
Should return a gym env which wraps the nethack gym env
"""
env = nethack_make_fn()
env = TimeLimit(env, max_episode_steps=10_000_0000)
return env
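# Purely illustrative sketch (not part of the starter kit): any gym.Wrapper
# subclass can be chained inside a make-fn in the same way as TimeLimit above.
# The ScaledRewardWrapper name and the scale value below are hypothetical.
import gym


class ScaledRewardWrapper(gym.Wrapper):
    def __init__(self, env, scale=1.0):
        super().__init__(env)
        self.scale = scale

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, reward * self.scale, done, info


def scaledreward_env_fn():
    # Chain wrappers exactly like addtimelimitwrapper_fn does above.
    env = addtimelimitwrapper_fn()
    env = ScaledRewardWrapper(env, scale=1.0)
    return env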