Learning How to Walk

Reinforcement learning environments with musculoskeletal models


Observation vector

Posted by joonho about 3 years ago

The instructions say the observation is a 25-dimensional vector, but in the code it appears to be a 31-dimensional vector. What is the difference?

Posted by Lukasz_ about 3 years ago

Hi, indeed it is a 31-dimensional vector, and it has been this size throughout the challenge. A previous version had 25 dimensions, and that number remained in the text. Sorry for the confusion; we will fix the instructions soon.

Best, Łukasz


Posted by shubhamjain0594 about 3 years ago

A description of what each component of the vector means would be appreciated. At the moment one has to dig through the OpenSim documentation, which is not very intuitive.


Posted by Lukasz_ about 3 years ago

Hi, the observation vector is defined here: https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/human.py#L71-L121. It basically corresponds to the angular positions, velocities and accelerations of the joints. There are also parameters of the center of mass and of the toes. Joints are ordered the way they are listed when you run the simulation.

That said, we encourage you not to use semantic information about the state. Fine-tuning control is the classical approach, which is not robust, is hard to adapt to new models and is most likely suboptimal. It would be great to beat this approach with reinforcement learning and solve motor control once and for all ;)

Best, Łukasz

Posted by shubhamjain0594 about 3 years ago

Yeah, it makes sense.

Just one more thing to clarify: is it valid to calculate the reward using other information that is not part of the state? I feel this defeats the purpose of RL, so I would like your view on this.

Posted by Lukasz_ about 3 years ago

Things like an extra penalty on tripping and falling are ok, as they don’t change the essence of the problem and can speed up learning a bit. Using external tracking data is not ok :). I will try to make it clearer in the description, thanks for the comment.

Posted by shubhamjain0594 about 3 years ago

Suppose I use the x position of the ankle joint. This parameter is not part of the observed state, but I can still see it in my model. Does it count as external tracking data?

Posted by Lukasz_ about 3 years ago

For the purpose of this competition, please use the observation and, potentially, any other features derived from the observation, but not from the model. In the reward you can also use any function of the history of observations and actions, but nothing beyond that.

That said, outside the competition, if you have cool ways to use other data, we could work on that :)
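
One way to read this rule: a shaped reward is fine as long as it is computed only from observations and actions the agent has already seen. The sketch below is a hypothetical illustration, not the organizers' code. The class name `ShapedReward` and its parameters `fall_height` and `fall_penalty` are invented here. It assumes that index 3 of the observation is the pelvis height, as in the feature list posted later in this thread, and it borrows the 0.7 threshold from the comment in that list.

```python
import numpy as np

# Assumption: index 3 holds the pelvis height (pelvis_ty in the feature
# list posted later in this thread).
PELVIS_Y_IDX = 3


class ShapedReward:
    """Adds a fall penalty that depends only on the agent's own history."""

    def __init__(self, fall_height=0.7, fall_penalty=1.0):
        self.fall_height = fall_height    # hypothetical threshold
        self.fall_penalty = fall_penalty  # hypothetical penalty size
        self.observations = []
        self.actions = []

    def reset(self):
        """Forget the history at the start of a new episode."""
        self.observations.clear()
        self.actions.clear()

    def __call__(self, env_reward, observation, action):
        # Record the step so further terms could use the whole history.
        self.observations.append(np.asarray(observation, dtype=float))
        self.actions.append(np.asarray(action, dtype=float))

        shaped = float(env_reward)
        # Penalize a pelvis height below the threshold; nothing is read
        # from the simulator's internal model.
        if self.observations[-1][PELVIS_Y_IDX] < self.fall_height:
            shaped -= self.fall_penalty
        return shaped
```

In a training loop this would be called once per step with the reward, observation and action of that step, and `reset()` would be called at the start of each episode.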

Posted by shubhamjain0594 about 3 years ago

That clears a lot of things.

Thanks a lot.

Posted by AdamMiltonBarker about 3 years ago

I have removed all email notifications and I am still getting about 5 emails every few minutes from this website, what is going on ?

Posted by Radu_Ionescu almost 3 years ago

Hi, the observation vector is defined here https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/human.py#L71-L121 and it basically corresponds to angular positions, velocities and acceleration of joints.

I am having the same problem understanding what the observations are. You need that if you want to build a new reward function. Looking at that part of the code does not answer questions like how to get the z coordinate of the left knee or the y speed of the right pelvis joint.

Posted by joonho almost 3 years ago

It seems there is nothing about acceleration. Also, I think OpenSim is quite slow, slower than real time. Is this a problem with my PC, or is it slow for you too?

Posted by Lukasz_ almost 3 years ago

Indeed, OpenSim will not run in real time (it is very accurate, which makes it slower than, for example, game engines). It does not matter for the competition, though. Also, train without visualization; this may slightly speed up the process.

Regarding the observation vector, osim_model.joints contains, in order: pelvis, right hip, right knee, right ankle, left hip, left knee, left ankle.

Now, in https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/human.py#L71-L121:

For the pelvis (joint 0), getCoordinate(k) gives one of the three dimensions (0-x, 1-y, 2-z); getValue gives the position and getSpeedValue gives the speed.

For the other joints we have just one dimension (rotation): getValue gives the rotation and getSpeedValue gives the angular velocity.

Next come the center of mass and its speed, then, in similar fashion, the head, pelvis and feet.

I hope it helps.
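
The layout described above can be made explicit by slicing the 31-dimensional vector into named groups. This is a minimal sketch, not code from osim-rl. The function name `split_observation` and the group names are hypothetical, and the index ranges assume the ordering given in the feature list posted later in this thread: one leading zero, pelvis values and speeds, six joint rotations, six joint angular velocities, center of mass, then head, pelvis and feet positions.

```python
import numpy as np

# Assumed (start, stop) index ranges for each group of the 31 values.
OBSERVATION_GROUPS = {
    "zero": (0, 1),
    "pelvis_position": (1, 4),
    "pelvis_speed": (4, 7),
    # Right hip, knee, ankle, then left hip, knee, ankle.
    "joint_rotation": (7, 13),
    "joint_angular_velocity": (13, 19),
    "mass_center_position": (19, 21),
    "mass_center_velocity": (21, 23),
    "head_xy": (23, 25),
    "pelvis_xy": (25, 27),
    "left_foot_xy": (27, 29),
    "right_foot_xy": (29, 31),
}


def split_observation(observation):
    """Return a dict mapping each group name to its slice of the vector."""
    obs = np.asarray(observation, dtype=float)
    if obs.shape != (31,):
        raise ValueError("expected 31 values, got shape %s" % (obs.shape,))
    return {
        name: obs[start:stop]
        for name, (start, stop) in OBSERVATION_GROUPS.items()
    }
```

For example, `split_observation(obs)["joint_rotation"]` would return the six rotation values under these assumptions.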


Posted by ViktorF almost 3 years ago

Python feature definitions from my training code:

import math, collections
import numpy as np

# Coordinate axes from the view of the walking skeleton
# x: forward      getCoordinate(0)
# y: up           getCoordinate(1)
# z: right        getCoordinate(2)

# Observations
# One entry per observation index: name, expected range, default value,
# and whether the feature is used (1) or ignored (0) during training.
Feature = collections.namedtuple('Feature', ('name', 'min', 'max', 'default', 'use'))
OBSERVED_FEATURES = [
    Feature('zero', min=0.0, max=1.0, default=0.0, use=0),  # Always zero
    Feature('pelvis_tilt', min=-math.pi/2, max=math.pi/2, default=-0.05, use=1),
    Feature('pelvis_tx', min=-1.0, max=1e6, default=0.0, use=1),
    Feature('pelvis_ty', min=-1.0, max=2.0, default=0.91, use=1),  # Simulation stops below 0.7
    Feature('pelvis_tilt_v', min=-10.0*math.pi, max=10.0*math.pi, default=0.0, use=1),
    Feature('pelvis_tx_v', min=-20.0, max=20.0, default=0.0, use=1),
    Feature('pelvis_ty_v', min=-10.0, max=10.0, default=0.0, use=1),
    Feature('hip_flexion_r', min=-2*math.pi/3, max=2*math.pi/3, default=0.0, use=1),
    Feature('knee_angle_r', min=-2*math.pi/3, max=2*math.pi/3, default=0.0, use=1),
    Feature('ankle_angle_r', min=-math.pi/2, max=math.pi/2, default=0.0, use=1),
    Feature('hip_flexion_l', min=-2*math.pi/3, max=2*math.pi/3, default=0.0, use=1),
    Feature('knee_angle_l', min=-2*math.pi/3, max=2*math.pi/3, default=0.0, use=1),
    Feature('ankle_angle_l', min=-math.pi/2, max=math.pi/2, default=0.0, use=1),
    Feature('hip_flexion_r_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
    Feature('knee_angle_r_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
    Feature('ankle_angle_r_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
    Feature('hip_flexion_l_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
    Feature('knee_angle_l_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
    Feature('ankle_angle_l_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
    Feature('mass_center_x', min=-1.0, max=1e6, default=0.0, use=1),
    Feature('mass_center_y', min=-1.0, max=2.0, default=0.97, use=1),
    Feature('mass_center_x_v', min=-20.0, max=20.0, default=0.0, use=1),
    Feature('mass_center_y_v', min=-10.0, max=10.0, default=0.0, use=1),
    Feature('head_x', min=-1.0, max=1e6, default=0.0, use=1),
    Feature('head_y', min=0.0, max=3.0, default=1.54, use=1),
    Feature('pelvis_x', min=-1.0, max=1e6, default=0.0, use=0),  # Redundant, equal to pelvis_tx
    Feature('pelvis_y', min=-1.0, max=1.0, default=0.91, use=0),  # Redundant, equal to pelvis_ty
    Feature('foot_l_x', min=-1.0, max=1e6, default=0.0, use=1),
    Feature('foot_l_y', min=-1.0, max=2.0, default=0.0, use=1),
    Feature('foot_r_x', min=-1.0, max=1e6, default=0.0, use=1),
    Feature('foot_r_y', min=-1.0, max=2.0, default=0.0, use=1),
]

FEATURE_NAMES = [feature.name for feature in OBSERVED_FEATURES]
FEATURE_NAME2IDX = dict((name, idx) for idx, name in enumerate(FEATURE_NAMES))
FEATURE_MIN_VALUES = np.array([feature.min for feature in OBSERVED_FEATURES])
FEATURE_MAX_VALUES = np.array([feature.max for feature in OBSERVED_FEATURES])
FEATURE_DEFAULTS = np.array([feature.default for feature in OBSERVED_FEATURES])

Please leave a comment if you find bugs or misunderstandings above. I would be happy to fix them in my code.
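
As an illustration of how such a table might be used, the sketch below clips an observation to the per-feature bounds, rescales it to [0, 1] and drops the features marked use=0 before the values reach a learner. It is a hypothetical example rather than part of the posted training code: `normalize` is an invented helper, and only three entries of the list are repeated so that the snippet runs on its own.

```python
import collections
import math

import numpy as np

Feature = collections.namedtuple('Feature', ('name', 'min', 'max', 'default', 'use'))

# Three entries copied from the list above; the full table works the same way.
FEATURES = [
    Feature('pelvis_tilt', min=-math.pi/2, max=math.pi/2, default=-0.05, use=1),
    Feature('pelvis_ty', min=-1.0, max=2.0, default=0.91, use=1),
    Feature('pelvis_y', min=-1.0, max=1.0, default=0.91, use=0),
]

MIN_VALUES = np.array([f.min for f in FEATURES])
MAX_VALUES = np.array([f.max for f in FEATURES])
USED = np.array([f.use == 1 for f in FEATURES])


def normalize(values):
    """Clip to [min, max], rescale to [0, 1], keep only features with use=1."""
    values = np.asarray(values, dtype=float)
    clipped = np.clip(values, MIN_VALUES, MAX_VALUES)
    scaled = (clipped - MIN_VALUES) / (MAX_VALUES - MIN_VALUES)
    return scaled[USED]
```

Note that with an upper bound of 1e6 on the x positions, this kind of scaling squeezes realistic values into a very narrow band, so a different transformation may be preferable for those features.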