NIPS 2017: Learning to Run

Reinforcement learning environments with musculoskeletal models


Completed | 2154 submissions | 630 participants | 92337 views

Reward calculation for RunEnv

Posted by reason8.ai over 2 years ago

Hi, I want to duplicate here an issue that I opened on GitHub: https://github.com/stanfordnmbl/osim-rl/issues/43

I noticed that when you calculate the reward, you don’t update last_state to current_state. The only place where last_state is updated is the reset method. This means that you return not the one-step reward but the total reward accumulated up to the current episode step. I think you should fix it.
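
To illustrate the difference, here is a minimal sketch with a toy environment. ToyRunEnv, last_x, current_x, and the update_last_state flag are hypothetical names made up for this example; this is not the osim-rl code, only an assumed simplification in which the reward is the change in a single x position.

```python
class ToyRunEnv:
    """Hypothetical stand-in for RunEnv that tracks only one x position."""

    def __init__(self, update_last_state):
        # Flag invented for this sketch: True mimics the proposed fix.
        self.update_last_state = update_last_state
        self.reset()

    def reset(self):
        self.current_x = 0.0
        # Without the fix, this is the only place the reference point is set.
        self.last_x = 0.0

    def step(self, dx):
        self.current_x += dx
        reward = self.current_x - self.last_x
        if self.update_last_state:
            # Proposed fix: move the reference point forward after every step.
            self.last_x = self.current_x
        return reward


def rollout(env, moves):
    """Reset the environment and collect the reward of every step."""
    env.reset()
    return [env.step(dx) for dx in moves]


moves = [1.0, 2.0, 3.0]
buggy_rewards = rollout(ToyRunEnv(update_last_state=False), moves)
fixed_rewards = rollout(ToyRunEnv(update_last_state=True), moves)

print(buggy_rewards)  # [1.0, 3.0, 6.0] -> distance since reset at each step
print(fixed_rewards)  # [1.0, 2.0, 3.0] -> one-step rewards
```

In this toy case the fixed rewards sum to the last buggy reward, so an agent summing the buggy values would count early progress several times.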

Posted by spMohanty over 2 years ago

@mpavlov: Thanks for pointing this out. As a note to other participants: we are working on a fix and will make an official announcement soon.

Posted by Lukasz_ over 2 years ago

The updated environment is available in the iss43 branch (https://github.com/stanfordnmbl/osim-rl/tree/iss43). Please feel free to start using it for training. We are working on updating the grader.

Posted by Lukasz_ over 2 years ago

Please refer to https://github.com/stanfordnmbl/osim-rl/tree/iss43/docs and https://github.com/stanfordnmbl/osim-rl/issues/43 for details.

Posted by cg over 2 years ago

In the iss43 branch, the ligament penalty is scaled by 10e-8 since the commit with the message “Ligament forces penalty tuned down a lot for testing”. Is this how it will remain?

return delta_x - math.sqrt(lig_pen) * 10e-8
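
For a sense of scale, the line above can be wrapped in a small runnable sketch. The function name step_reward, the constant name, and the sample inputs are hypothetical and chosen only for illustration; lig_pen is assumed to be a non-negative number. Note that the literal 10e-8 is the same value as 1e-7.

```python
import math

# Scale factor copied from the quoted line; 10e-8 is the same float as 1e-7.
LIGAMENT_PENALTY_SCALE = 10e-8


def step_reward(delta_x, lig_pen):
    """Forward progress minus the scaled ligament penalty (illustrative only)."""
    return delta_x - math.sqrt(lig_pen) * LIGAMENT_PENALTY_SCALE


# Made-up inputs: 1 cm of progress and a penalty term of 4.0.
example_reward = step_reward(0.01, 4.0)
penalty_part = 0.01 - example_reward

print(10e-8 == 1e-7)   # True
print(example_reward)  # slightly below 0.01
```

With these made-up inputs the penalty removes about 2e-7 from a reward of 0.01, so at this scale the ligament term has very little influence.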

Posted by Lukasz_ over 2 years ago

Yes, we needed to scale down the ligament forces penalty.