Record normalized score
We record both the total **cumulative reward** and the normalized reward.
Normalized reward = sum(rewards) / (number_of_agents + max_number_of_time_steps)
We want to see how both of these behave as the agents evolve.
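
A minimal sketch of how the two values could be computed for one episode, following the formula as written above. The function name `score_episode` and the flat list of per-step rewards are assumptions made for illustration, not the project's actual API.

```python
from typing import Sequence, Tuple


def score_episode(
    rewards: Sequence[float],
    number_of_agents: int,
    max_number_of_time_steps: int,
) -> Tuple[float, float]:
    """Return (cumulative_reward, normalized_reward) for one episode.

    Hypothetical helper: `rewards` is assumed to be a flat sequence of
    all rewards collected during the episode.
    """
    cumulative_reward = float(sum(rewards))
    # Denominator taken verbatim from the formula in the text.
    denominator = number_of_agents + max_number_of_time_steps
    if denominator <= 0:
        raise ValueError("number_of_agents + max_number_of_time_steps must be positive")
    return cumulative_reward, cumulative_reward / denominator


# Example: 2 agents, at most 8 time steps, three recorded rewards.
cumulative, normalized = score_episode([-1.0, -1.0, -2.0], 2, 8)
print(cumulative, normalized)
```

Returning both values together keeps them paired per episode, which makes it easier to compare how each one changes over time.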