Reinforcement learning environments with musculoskeletal models


Posted by Lukasz_ over 2 years ago | Quote

Hi, indeed it is a 31-dimensional vector, and it has been this size throughout the challenge. In the previous version we had 25 dimensions, and that number remained in the text. Sorry for the confusion; we will fix the instructions soon.

Best, Łukasz

Posted by shubhamjain0594 over 2 years ago | Quote

A description of what each component of the vector means would be appreciated. To understand it, one has to dig through the OpenSim documentation, which is not very intuitive.

Thanks.

Posted by Lukasz_ over 2 years ago | Quote

Hi, the observation vector is defined here https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/human.py#L71-L121 and it basically corresponds to the angular positions, velocities, and accelerations of the joints. There are also parameters of the center of mass and of the toes. Joints are ordered the way they are listed when you run the simulation.

Yet, we encourage you not to use semantic information about the state. Fine-tuned control is a classical approach which is not robust, hardly adaptable to new models, and most likely suboptimal. It would be great to beat this approach with reinforcement learning and solve motor control once and for all ;)

Best, Łukasz

Posted by shubhamjain0594 over 2 years ago | Quote

Yeah, it makes sense.

Just another thing to clarify: if we use information that is not part of the state to calculate the reward, is that approach valid? I feel this defeats the purpose of RL, so I need your view on this.

Posted by Lukasz_ over 2 years ago | Quote

Things like an extra penalty on tripping and falling are OK, as they don't change the essence of the problem and can speed up learning a bit. Using external tracking data is not OK :). I will try to make this clearer in the description; thanks for the comment.
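As a concrete illustration of the kind of shaping Lukasz allows, here is a minimal sketch of a falling penalty computed from the observation alone. The pelvis-height index (3, matching the `pelvis_ty` slot in the feature table later in this thread) and the penalty weight are assumptions for illustration, not official constants.

```python
def shaped_reward(base_reward, obs, fall_height=0.7):
    """Subtract an extra penalty when the pelvis height drops toward
    the fall threshold. Index 3 ('pelvis_ty') is an assumption taken
    from the feature table in this thread, not an official constant."""
    pelvis_y = obs[3]
    penalty = 10.0 if pelvis_y < fall_height else 0.0
    return base_reward - penalty
```

Since the penalty is a function of the observation only, it should fall squarely within the "derived from the observation" rule discussed below.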

Posted by shubhamjain0594 over 2 years ago | Quote

If I use the x position of the ankle joint, this parameter is not part of the observed state, but I can still read it from my model. Does that count as external tracking data?

Posted by Lukasz_ over 2 years ago | Quote

For the purpose of this competition, please use the observation, and potentially any other features derived from the observation, but not from the model. In the reward you can also use any function of the history of observations and actions, but nothing beyond that.

Yet, outside the competition if you have cool ways to use other data we could work on that :)
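One example of a reward term that is "a function of the history of observations and actions" in the sense allowed above is a smoothness penalty on consecutive actions. This is a sketch; the class name and weight are arbitrary illustrations, not part of osim-rl.

```python
class SmoothnessPenalty:
    """Reward shaping that depends only on the history of actions:
    penalizes large changes between consecutive action vectors.
    The weight is an arbitrary choice for illustration."""

    def __init__(self, weight=0.1):
        self.weight = weight
        self.prev_action = None

    def __call__(self, base_reward, action):
        penalty = 0.0
        if self.prev_action is not None:
            # Squared L2 distance between this action and the last one.
            penalty = self.weight * sum(
                (a - b) ** 2 for a, b in zip(action, self.prev_action))
        self.prev_action = list(action)
        return base_reward - penalty
```

Because it uses only past actions, it stays within the competition rule while discouraging jittery muscle excitations.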

Posted by AdamMiltonBarker over 2 years ago | Quote

I have removed all email notifications and I am still getting about 5 emails every few minutes from this website. What is going on?

Posted by Radu_Ionescu over 2 years ago | Quote

> Hi, the observation vector is defined here https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/human.py#L71-L121 and it basically corresponds to angular positions, velocities and acceleration of joints.

I am having the same problem with the data: understanding what the observations are. You need that if you want to build a new reward function. By looking at that part of the code, you cannot answer questions like how to get the z coordinate of the left knee or the y speed of the right pelvis joint.

Posted by joonho over 2 years ago | Quote

Seems like there is nothing about acceleration. Also, I think OpenSim is quite slow, slower than real time. Is this a problem with my PC, or is it also slow for you?

Posted by Lukasz_ over 2 years ago | Quote

Indeed, OpenSim will not run in real time (it is super-accurate, which makes it slower than, for example, gaming engines). That does not matter for the competition, though. Also, do training without visualization; this may slightly speed up the process.

Regarding the observation vector, `osim_model.joints` contains:

```
pelvis
right hip
right knee
right ankle
left hip
left knee
left ankle
```

Now, in https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/human.py#L71-L121 there are:

`getCoordinate(k)` gives one of the three dimensions (0-x, 1-y, 2-z) for the pelvis (joint 0); on the coordinate, `getValue` gives the position and `getSpeedValue` gives the speed.

For the other joints we have just one dimension (rotation): `getValue` gives the rotation and `getSpeedValue` gives the angular velocity.

Next we have the center of mass and its speed. Then, in similar fashion, the head, pelvis, and feet.

I hope it helps
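Putting Lukasz's description together, one plausible way to split the flat observation into named groups looks like the sketch below. The exact slot order (including the unused first slot, which matches the feature table later in this thread) is an assumption; verify it against human.py before relying on it.

```python
def unpack_observation(obs):
    """Split the 31-dimensional observation into named groups following
    the layout described in this thread. The slot order is an assumption;
    verify against osim-rl's human.py before relying on it."""
    assert len(obs) == 31
    return {
        'pelvis_pose':  obs[1:4],    # tilt, x, y (slot 0 appears unused)
        'pelvis_vel':   obs[4:7],    # tilt, x, y velocities
        'joint_angles': obs[7:13],   # r hip, r knee, r ankle, l hip, l knee, l ankle
        'joint_vels':   obs[13:19],  # same order, angular velocities
        'mass_center':  obs[19:23],  # x, y, vx, vy
        'head':         obs[23:25],  # x, y
        'pelvis_xy':    obs[25:27],  # redundant with pelvis_pose x, y
        'feet':         obs[27:31],  # left x, y, right x, y
    }
```

A helper like this also makes reward shaping easier to read, since terms can reference `parts['feet']` instead of raw indices.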

Posted by ViktorF over 2 years ago | Quote

Python feature definitions from my training code:

```
import math, collections
import numpy as np
# Coordinate axes from the view of the walking skeleton
# x: forward getCoordinate(0)
# y: up getCoordinate(1)
# z: right getCoordinate(2)
# Observations
Feature = collections.namedtuple('Feature', ('name', 'min', 'max', 'default', 'use'))
OBSERVED_FEATURES = (
Feature('zero', min=0.0, max=1.0, default=0.0, use=0), # Always zero
Feature('pelvis_tilt', min=-math.pi/2, max=math.pi/2, default=-0.05, use=1),
Feature('pelvis_tx', min=-1.0, max=1e6, default=0.0, use=1),
Feature('pelvis_ty', min=-1.0, max=2.0, default=0.91, use=1), # Simulation stops below 0.7
Feature('pelvis_tilt_v', min=-10.0*math.pi, max=10.0*math.pi, default=0.0, use=1),
Feature('pelvis_tx_v', min=-20.0, max=20.0, default=0.0, use=1),
Feature('pelvis_ty_v', min=-10.0, max=10.0, default=0.0, use=1),
Feature('hip_flexion_r', min=-2*math.pi/3, max=2*math.pi/3, default=0.0, use=1),
Feature('knee_angle_r', min=-2*math.pi/3, max=2*math.pi/3, default=0.0, use=1),
Feature('ankle_angle_r', min=-math.pi/2, max=math.pi/2, default=0.0, use=1),
Feature('hip_flexion_l', min=-2*math.pi/3, max=2*math.pi/3, default=0.0, use=1),
Feature('knee_angle_l', min=-2*math.pi/3, max=2*math.pi/3, default=0.0, use=1),
Feature('ankle_angle_l', min=-math.pi/2, max=math.pi/2, default=0.0, use=1),
Feature('hip_flexion_r_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
Feature('knee_angle_r_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
Feature('ankle_angle_r_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
Feature('hip_flexion_l_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
Feature('knee_angle_l_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
Feature('ankle_angle_l_v', min=-10*math.pi, max=10*math.pi, default=0.0, use=1),
Feature('mass_center_x', min=-1.0, max=1e6, default=0.0, use=1),
Feature('mass_center_y', min=-1.0, max=2.0, default=0.97, use=1),
Feature('mass_center_x_v', min=-20.0, max=20.0, default=0.0, use=1),
Feature('mass_center_y_v', min=-10.0, max=10.0, default=0.0, use=1),
Feature('head_x', min=-1.0, max=1e6, default=0.0, use=1),
Feature('head_y', min=0.0, max=3.0, default=1.54, use=1),
Feature('pelvis_x', min=-1.0, max=1e6, default=0.0, use=0), # Redundant, equals to pelvis_tx
Feature('pelvis_y', min=-1.0, max=1.0, default=0.91, use=0), # Redundant, equals to pelvis_ty
Feature('foot_l_x', min=-1.0, max=1e6, default=0.0, use=1),
Feature('foot_l_y', min=-1.0, max=2.0, default=0.0, use=1),
Feature('foot_r_x', min=-1.0, max=1e6, default=0.0, use=1),
Feature('foot_r_y', min=-1.0, max=2.0, default=0.0, use=1),
)
FEATURE_NAMES = [feature.name for feature in OBSERVED_FEATURES]
FEATURE_NAME2IDX = dict((name, idx) for idx, name in enumerate(FEATURE_NAMES))
FEATURE_MIN_VALUES = np.array([feature.min for feature in OBSERVED_FEATURES])
FEATURE_MAX_VALUES = np.array([feature.max for feature in OBSERVED_FEATURES])
FEATURE_MAGNITUDES = FEATURE_MAX_VALUES - FEATURE_MIN_VALUES
FEATURE_DEFAULTS = np.array([feature.default for feature in OBSERVED_FEATURES])
```
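One common use for per-feature bounds like `FEATURE_MIN_VALUES` and `FEATURE_MAX_VALUES` above is min-max scaling of the observation before feeding it to a policy network. A minimal sketch, with made-up bounds in the example since the real arrays come from the table above:

```python
import numpy as np

def scale_features(obs, min_values, max_values):
    """Min-max scale each feature to [0, 1] using per-feature bounds.
    Values are clipped first so out-of-range observations (e.g. an
    unbounded forward position) still land in [0, 1]."""
    obs = np.clip(obs, min_values, max_values)
    return (obs - min_values) / (max_values - min_values)

# Tiny illustration with made-up two-feature bounds:
lo = np.array([-1.0, 0.0])
hi = np.array([1.0, 2.0])
scaled = scale_features(np.array([0.0, 3.0]), lo, hi)  # [0.5, 1.0]
```

With the arrays defined above, this would be called as `scale_features(obs, FEATURE_MIN_VALUES, FEATURE_MAX_VALUES)`.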

Please leave a comment if you find bugs or misunderstandings above. I would be happy to fix them in my code.

The instructions say the observation is a 25-dimensional vector, but in the code it seems to be 31-dimensional. What is the difference?
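As Lukasz explains at the top of the thread, the 25 was a leftover from a previous model version. Counting the entries in the feature table above (the group counts below are read off that table, not official documentation) confirms the 31:

```python
# Dimension accounting for the observation vector, per the feature
# table in this thread:
groups = {
    'placeholder zero': 1,
    'pelvis pose (tilt, tx, ty)': 3,
    'pelvis velocities': 3,
    'joint angles (hip, knee, ankle, both legs)': 6,
    'joint angular velocities': 6,
    'center of mass position and velocity': 4,
    'head position (x, y)': 2,
    'pelvis position (x, y, redundant)': 2,
    'feet positions (left and right x, y)': 4,
}
total = sum(groups.values())
print(total)  # 31
```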