NIPS 2017: Learning to Run

Reinforcement learning environments with musculoskeletal models


How is the total reward of a submission calculated?

Posted by Anton_Pechenko over 4 years ago

After submitting, the client prints: Your total reward from this submission: 711.515570

However, when I sum the rewards myself with

    total_reward = 0.0
    ….
    [next_observation, reward, done, info] = client.env_step(action.tolist())
    total_reward += reward

I get 2134.54671077.

So how exactly is the total reward calculated?

Posted by Anton_Pechenko over 4 years ago

I found that it is the average total reward over 3 episodes: 2134.54671077 / 3 == 711.515570
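To make the difference between the two numbers concrete, here is a minimal sketch of the bookkeeping, assuming the reported score is the mean of the per-episode totals as observed above. The function name summarize_rewards and the (reward, done) input format are hypothetical, and the rewards are made up; in a real run the pairs would come from the values returned by client.env_step.

```python
def summarize_rewards(steps):
    """Split a stream of (reward, done) pairs into per-episode totals.

    Returns (grand_total, episode_totals, mean_per_episode).
    """
    episode_totals = []
    running = 0.0
    for reward, done in steps:
        running += reward
        if done:
            # Episode finished: store its total and start a new one.
            episode_totals.append(running)
            running = 0.0
    grand_total = sum(episode_totals)
    mean = grand_total / len(episode_totals) if episode_totals else 0.0
    return grand_total, episode_totals, mean


# Made-up rewards for three short episodes (not real grader output).
steps = [(1.0, False), (2.0, True), (3.0, False), (3.0, True), (9.0, True)]
grand_total, episode_totals, mean = summarize_rewards(steps)
print(grand_total, episode_totals, mean)
```

Summing every reward without resetting at done gives the grand total (the analogue of 2134.54671077), while dividing by the number of episodes gives the analogue of the 711.515570 printed by the client.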


Posted by spMohanty over 4 years ago

Hi @Anton,

This could be because of a mismatch between the osim-rl version used by the server and the one used by your client. The server has now been modified to throw an error if the versions don’t match.

Apart from that, we do expect minor differences in the cumulative reward calculation because of differences in the OS environment. The grader runs on a 64-bit Ubuntu 16.04 instance.