Commit 63a8c5b0 authored by Eric Hambro

Add warning for no render on GitLab.

parent 34c38d11
@@ -145,10 +145,12 @@
``` python
import gym
import nle  # importing nle registers the NetHack environments with gym

env = gym.make("NetHackChallenge-v0", savedir=None) # (Don't save a recording of the episode)
env.reset() # each reset generates a new dungeon
env.step(1) # move agent '@' north
```
%% Cell type:markdown id:a7fbc173 tags:
@@ -267,11 +269,11 @@
Included in the starter kit is a Torchbeast implementation of IMPALA, a large-scale distributed RL algorithm, adapted for NLE. A similar model was used in the original NLE paper to produce non-trivial learning curves for environments such as NetHackScore-v0.
In the original NLE paper, the agent architecture was as follows:
As can be seen, the model used both an agent-centric view and a global view, each processed with convolutional neural network (CNN) layers. In addition, the `blstats` are processed with an MLP. Finally, the embeddings are passed into an LSTM to deal with partial observability.
The baseline is almost identical, with one key difference: we have added a CNN encoder for the `message` observation. This architecture may provide a promising starting point for development, but the sky is the limit for new ideas! Check out `./nethack_baselines/torchbeast/` to get started!
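
To make the "agent-centric view" mentioned above more concrete, here is a minimal pure-Python sketch of cropping a fixed window around the agent from the full map (the "global view"). It is an illustration only, not the baseline's code: the function name `crop_around_agent`, the padding value `PAD`, the 9x9 window size and the toy map contents are all assumptions made for this example. In a real agent the position would presumably come from the first two entries of `blstats`, and the crop would be done on tensors.

``` python
from typing import List

PAD = 0  # hypothetical padding value for cells that fall outside the map


def crop_around_agent(grid: List[List[int]], x: int, y: int, size: int = 9) -> List[List[int]]:
    """Return a size x size window of `grid` centred on column x, row y.

    `size` is assumed to be odd so that the agent sits exactly in the middle.
    Cells outside the map are filled with PAD.
    """
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    window = []
    for r in range(y - half, y + half + 1):
        row = []
        for c in range(x - half, x + half + 1):
            inside = 0 <= r < rows and 0 <= c < cols
            row.append(grid[r][c] if inside else PAD)
        window.append(row)
    return window


# Toy 21 x 79 map (illustrative contents): every cell holds a unique non-zero id,
# so it is easy to see which cells end up in the crop.
ROWS, COLS = 21, 79
toy_map = [[r * COLS + c + 1 for c in range(COLS)] for r in range(ROWS)]

centre_crop = crop_around_agent(toy_map, x=40, y=10)  # agent in the middle of the map
corner_crop = crop_around_agent(toy_map, x=0, y=0)    # agent in the top-left corner
```

The full map and the crop would then each go through their own CNN encoder. The crop keeps the agent at a fixed position in its input, which is one plausible reason for using both views.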