Global observation error reported by participant
Thank you for the pointers. They do help, and they show me that the current encoding of the global observation seems wrong. For instance, the first channel of the (height, width, 4) map contains the initial direction of the current agent. But zero is both the default value and a valid value for the initial direction (which is a number from 0 to 3). So a zero in that channel could mean either "no agent here" or "an agent facing direction 0", and this encoding is not enough to identify the initial position of each agent.
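To make the ambiguity concrete, here is a minimal sketch using plain NumPy. It does not call the environment; the grid size, the `encode_direction` helper and the `-1` fill value are assumptions I made up for illustration, and only a single channel is modelled.

```python
import numpy as np

height, width = 3, 4  # hypothetical grid size, for illustration only


def encode_direction(position, direction, fill=0):
    """Hypothetical helper: write one agent's direction into a (height, width) channel."""
    channel = np.full((height, width), fill, dtype=int)
    channel[position] = direction
    return channel


empty = np.zeros((height, width), dtype=int)
facing_0 = encode_direction((1, 2), 0)  # agent at (1, 2) with direction 0
facing_3 = encode_direction((1, 2), 3)  # same cell, direction 3

# With 0 as the fill value, an agent with direction 0 looks like an empty map.
ambiguous = np.array_equal(facing_0, empty)

# An agent with a non-zero direction can still be located.
recoverable = np.argwhere(facing_3 != 0).tolist()

# Assumed workaround (not the current encoding): fill empty cells with -1.
with_sentinel = encode_direction((1, 2), 0, fill=-1)
sentinel_cells = np.argwhere(with_sentinel != -1).tolist()

print(ambiguous, recoverable, sentinel_cells)
```

With a fill value outside the 0 to 3 range, the position can be recovered whatever the direction is. Storing `direction + 1` would be another option along the same lines.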
Besides the logical issue with the encoding (which I don’t think I’m wrong about), I am seeing a second issue: this (height, width, 4) map does not always seem to be fully populated for each agent. What I mean is: in the observation of each agent x, I printed every cell (i, j) that has a non-zero value in at least one of the 4 channels of the map. This should always print N cells, where N is the number of agents, but for some agents it prints fewer than N, and I don’t know why.
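The check described above can be sketched roughly as follows. This runs on a synthetic array, not on a real observation; the `nonzero_cells` helper, the 3x3 grid and the values written into the channels are all assumptions for illustration. One possible, unconfirmed explanation for the lower count is that an agent whose values happen to be zero in all 4 channels is invisible to this kind of check, which the sketch reproduces.

```python
import numpy as np


def nonzero_cells(agent_map):
    """Return the (i, j) cells where at least one channel is non-zero."""
    mask = np.any(agent_map != 0, axis=-1)
    return [tuple(cell) for cell in np.argwhere(mask).tolist()]


# Hypothetical 3x3 map with 4 channels and N = 3 agents (values are made up).
n_agents = 3
agent_map = np.zeros((3, 3, 4), dtype=int)
agent_map[0, 0, 0] = 2  # agent at (0, 0), non-zero value in the first channel
agent_map[1, 1, 1] = 1  # agent at (1, 1), non-zero value in another channel
# The third agent sits at (2, 2), but all of its channel values are 0,
# so nothing distinguishes its cell from an empty one.

cells = nonzero_cells(agent_map)
print(f"{len(cells)} of {n_agents} agents visible: {cells}")
```

If the real observations behave like this, the missing agents should be exactly those whose entries are zero in every channel, which would be worth checking against the known agent positions.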