Reinforcement learning environments with musculoskeletal models
Learning how to walk
Our movement originates in the brain. Many neurological disorders, such as cerebral palsy, multiple sclerosis, or stroke, can lead to problems with walking. Treatments are often symptomatic, and the outcomes of surgery are hard to predict. Understanding the underlying mechanisms is key to improving treatments. This motivates our efforts to model the motor control unit of the brain.
In this challenge, your task is to model the motor control unit in a virtual environment. You are given a musculoskeletal model with 18 muscles to control. Every 10 ms you send signals to these muscles to activate or deactivate them. The objective is to walk as far as possible in 5 seconds.
For modelling physics we use OpenSim, a biomechanical physics environment for musculoskeletal simulations. You can read more details here.
NOTE: There have been a few changes to the API of the grading server. Please update your osim-rl installation with:
pip install git+https://github.com/kidzik/osim-rl.git
and update your submission script by referring to: https://github.com/stanfordnmbl/osim-rl/blob/master/scripts/submit.py#L43
In the meantime, if you run into scary-looking error messages when using your previous submission scripts, please do not panic! :D
This challenge wouldn’t be possible without:
- Stanford Neuromuscular Biomechanics Lab
- Stanford Mobilize Center
- OpenAI gym
- OpenAI http client
- and many other teams, individuals and projects
For more details and queries please contact
Your task is to build a function f which takes the current state observation (a 31-dimensional vector) and returns muscle activations action (an 18-dimensional vector) in a way that maximizes the reward.
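To make the shape of this function concrete, here is a minimal sketch of a controller with the stated input and output sizes. The linear form, the helper name make_linear_policy, and the choice to squash activations into [0, 1] are illustrative assumptions, not part of the challenge definition.

```python
import math
import random

OBS_DIM = 31  # observation length given in the task description
ACT_DIM = 18  # one activation per muscle


def make_linear_policy(weights, bias):
    """Build f(observation) -> action for a hypothetical linear controller."""
    def f(observation):
        if len(observation) != OBS_DIM:
            raise ValueError("expected %d values, got %d" % (OBS_DIM, len(observation)))
        action = []
        for row, b in zip(weights, bias):
            raw = sum(w * o for w, o in zip(row, observation)) + b
            # Clamp before the sigmoid so math.exp cannot overflow.
            raw = max(-60.0, min(60.0, raw))
            # Assumption: activations live in [0, 1], hence the sigmoid.
            action.append(1.0 / (1.0 + math.exp(-raw)))
        return action
    return f


# Small random weights, purely for illustration; a real controller would be trained.
rng = random.Random(0)
weights = [[rng.gauss(0.0, 0.1) for _ in range(OBS_DIM)] for _ in range(ACT_DIM)]
f = make_linear_policy(weights, [0.0] * ACT_DIM)

action = f([0.0] * OBS_DIM)  # 18 activations, one per muscle
```

Any function with the same signature would do; a trained neural network is a common choice for this kind of problem.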
The trial ends either if the pelvis of the model goes below 0.7 meters or if you reach 500 iterations (corresponding to 5 seconds in the virtual environment). Let N be the length of the trial. Your total reward is simply the position of the pelvis on the x axis after N steps. The value is given in centimeters.
After each iteration you get a reward equal to the change in the x-axis position of the pelvis during this iteration.
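The two descriptions of the reward agree because the per-step changes add up to the final position. The sketch below replays a recorded pelvis track and accumulates the reward. The function name run_trial and the track format are hypothetical, the pelvis is assumed to start at x = 0, and whether the step on which the model falls still earns its reward is also an assumption.

```python
MIN_PELVIS_HEIGHT = 0.7  # meters; the trial ends once the pelvis drops below this
MAX_STEPS = 500          # 500 iterations of 10 ms each, i.e. 5 seconds


def run_trial(pelvis_track):
    """Accumulate per-step rewards over a recorded pelvis track.

    pelvis_track[0] is the starting (x, height) pose; every later entry is
    the pose after one more step. Returns (N, total_reward).
    """
    total = 0.0
    n = 0
    prev_x = pelvis_track[0][0]
    for x, height in pelvis_track[1:MAX_STEPS + 1]:
        total += x - prev_x  # reward for this step: change in x
        prev_x = x
        n += 1
        if height < MIN_PELVIS_HEIGHT:
            break  # the model fell, so the trial stops here
    return n, total


# Made-up track: the pelvis dips below 0.7 after the third step.
track = [(0.0, 0.9), (1.5, 0.9), (3.0, 0.85), (4.0, 0.65), (5.0, 0.6)]
n, total = run_trial(track)
```

Here the trial lasts N = 3 steps and the accumulated reward equals the x position reached at the third step, since the track starts at x = 0.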
You can test your model on your local machine. For submission, you will need to interact with the remote environment: crowdAI sends you the current observation and you need to send back the action you take in the given state.
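The exchange of observations and actions follows the usual observe-act loop. The sketch below shows that loop against a stand-in environment so that it runs without OpenSim or a network connection. StubEnv, its made-up reward, and run_episode are hypothetical; only the reset/step shape is borrowed from the classic OpenAI gym interface, and the real client used for submission may differ.

```python
class StubEnv:
    """Hypothetical stand-in with a gym-style reset/step interface."""

    def __init__(self, max_steps=500):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * 31  # placeholder 31-dimensional observation

    def step(self, action):
        if len(action) != 18:
            raise ValueError("expected 18 muscle activations")
        self.t += 1
        # Made-up reward, not the OpenSim physics: mean activation times 0.01.
        reward = 0.01 * sum(action) / len(action)
        done = self.t >= self.max_steps
        return [0.0] * 31, reward, done, {}


def run_episode(env, policy):
    """Receive an observation, send back an action, repeat until done."""
    observation = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done, _ = env.step(action)
        total += reward
    return total


# Constant half activation on every muscle, for five steps.
total_reward = run_episode(StubEnv(max_steps=5), lambda obs: [0.5] * 18)
```

Keeping the policy separate from the loop means the same function can be driven by a local environment during training and by the remote one at submission time.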
You are allowed to:
- Modify the objective function for training (e.g. an extra penalty for falling or moving too fast, a reward for keeping the head at the same level, etc.)
- Modify the musculoskeletal model for training (e.g. constrain the Y axis of the pelvis)
- Submit at most one submission every 6 hours.
Note that the model trained in your modified environment must still be compatible with the challenge environment.
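As an illustration of a modified training objective, the sketch below adds a fall penalty and a head-stability term to the per-step reward. The function shaped_reward, its parameters, and the default weights are hypothetical choices made for this example. The shaped value would only be used during training, so the observation and action formats stay the same as in the challenge environment.

```python
FALL_HEIGHT = 0.7  # meters; the pelvis height at which a trial ends


def shaped_reward(step_reward, pelvis_height, head_height_change,
                  fall_penalty=10.0, head_weight=0.5):
    """Return a training-only reward built on top of the per-step reward.

    All weights are illustrative and would need tuning.
    """
    reward = step_reward
    if pelvis_height < FALL_HEIGHT:
        reward -= fall_penalty  # extra penalty for falling
    # Penalize vertical head movement to encourage a level head.
    reward -= head_weight * abs(head_height_change)
    return reward


upright = shaped_reward(1.0, pelvis_height=0.9, head_height_change=0.0)
fallen = shaped_reward(1.0, pelvis_height=0.6, head_height_change=0.2)
```

With these defaults an upright step with a steady head keeps its original reward, while a step that ends in a fall is pushed strongly negative.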
You are not allowed to:
- Use external datasets (e.g. kinematics of people walking)
- Engineer the trajectories/muscle activations by hand
crowdAI reserves the right to modify challenge rules as required.
The winner will be invited to the 2nd Applied Machine Learning Days at EPFL in Switzerland on January 29 & 30, 2018, with travel and accommodation covered.