Reinforcement learning environments with musculoskeletal models
Our movement originates in the brain. Many neurological disorders, such as cerebral palsy, multiple sclerosis, or stroke, can lead to problems with walking. Treatments are often symptomatic, and the outcomes of surgeries are hard to predict. Understanding the underlying mechanisms is key to improving treatments. This motivates our efforts to model the motor control unit of the brain.
In this challenge, your task is to model the motor control unit in a virtual environment. You are given a musculoskeletal model with 18 muscles to control. Every 10 ms you send signals to these muscles to activate or deactivate them. The objective is to walk as far as possible in 5 seconds.
For modelling physics we use OpenSim, a biomechanical physics environment for musculoskeletal simulations. You can read more details here.
NOTE: There have been a few changes to the API of the grading server. Please update your osim-rl installation by running:
pip install git+https://github.com/kidzik/osim-rl.git
and update your submission script by referring to: [https://github.com/stanfordnmbl/osim-rl/blob/master/scripts/submit.py#L43](https://github.com/stanfordnmbl/osim-rl/blob/master/scripts/submit.py#L43)
In the meantime, if you run into scary-looking error messages when using your previous submission scripts, please do not panic!! :D :D
This challenge wouldn't be possible without:
For more details and queries please contact
Your task is to build a function `f` which takes the current state `observation` (a 31-dimensional vector) and returns muscle activations `action` (an 18-dimensional vector) in a way that maximizes the reward.
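To make the expected shape of `f` concrete, here is a minimal sketch of a policy with the right input and output sizes. The linear form, the helper name `make_policy`, the random weights, and the clipping of activations to the range [0, 1] are all assumptions made for illustration; they are not part of the challenge specification, and a competitive controller would be trained rather than randomly initialized.

```python
import random

N_OBS = 31      # observation size stated in the task description
N_MUSCLES = 18  # action size stated in the task description


def make_policy(seed=0):
    """Build a toy linear policy with fixed random weights (illustrative only)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(N_OBS)]
               for _ in range(N_MUSCLES)]

    def f(observation):
        if len(observation) != N_OBS:
            raise ValueError("expected %d values, got %d" % (N_OBS, len(observation)))
        action = []
        for row in weights:
            raw = 0.5 + sum(w * o for w, o in zip(row, observation))
            # Assumption: muscle activations are clipped to the range [0, 1].
            action.append(min(1.0, max(0.0, raw)))
        return action

    return f


f = make_policy()
action = f([0.0] * N_OBS)  # one activation per muscle
```

Any learned model can take the place of the linear map, as long as it keeps the same 31-in, 18-out interface.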
The trial ends either if the pelvis of the model goes below `0.7` meter or if you reach `500` iterations (corresponding to `5` seconds in the virtual environment). Let `N` be the length of the trial. Your total reward is simply the position of the pelvis on the `x` axis after `N` steps. The value is given in centimeters.
After each iteration you get a reward equal to the change in the position of the pelvis along the `x` axis during this iteration.
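The termination and reward rules can be sketched as a small scoring function over a recorded trajectory. The function name `score_trial`, the `(x, height)` input format, the starting position `start_x=0.0`, and the choice to count the step on which the pelvis drops below the threshold are assumptions made for this illustration; the grading server remains the authority on how trials are scored.

```python
MIN_PELVIS_HEIGHT = 0.7  # trial ends when the pelvis drops below this height
MAX_STEPS = 500          # 500 iterations, i.e. 5 seconds at 10 ms per step


def score_trial(pelvis_positions, start_x=0.0):
    """Return (N, per-step rewards, total reward) for a recorded trajectory.

    pelvis_positions is a hypothetical list of (x, height) pairs, one per step.
    """
    rewards = []
    previous_x = start_x
    for x, height in pelvis_positions[:MAX_STEPS]:
        rewards.append(x - previous_x)  # reward = change in x during this step
        previous_x = x
        if height < MIN_PELVIS_HEIGHT:
            break  # the model has fallen, so the trial ends here
    return len(rewards), rewards, sum(rewards)


# Toy trajectory: the pelvis drops below the threshold on the third step.
trajectory = [(0.01, 0.9), (0.03, 0.9), (0.04, 0.65), (0.10, 0.5)]
n_steps, step_rewards, total = score_trial(trajectory)
```

Because each step reward is a difference of consecutive `x` positions, the rewards telescope: their sum equals the final pelvis position minus the starting one, which matches the total reward described above when the model starts at `x = 0`.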
You can test your model on your local machine. For submission, you will need to interact with the remote environment: crowdAI sends you the current `observation` and you need to send back the `action` you take in the given state.
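The exchange with the remote environment follows a simple observe-then-act loop, sketched below. `FakeRemoteEnv` is a hypothetical stand-in written only so the loop can run offline; it is not the crowdAI client, and its method names, return values, and dummy reward are assumptions. For the actual client calls, refer to the submission script in the osim-rl repository.

```python
class FakeRemoteEnv:
    """Hypothetical stand-in for the grading server; not the crowdAI client API."""

    def __init__(self, n_steps=5, n_obs=31):
        self.n_steps = n_steps
        self.n_obs = n_obs
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * self.n_obs

    def step(self, action):
        if len(action) != 18:
            raise ValueError("action must have 18 muscle activations")
        self.t += 1
        observation = [float(self.t)] * self.n_obs
        reward = 0.01  # dummy per-step reward
        done = self.t >= self.n_steps
        return observation, reward, done


def run_episode(env, policy):
    """Exchange observations and actions until the environment reports done."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)  # decide on muscle activations
        observation, reward, done = env.step(action)  # send them back
        total_reward += reward
    return total_reward


# A constant policy that half-activates every muscle, just to drive the loop.
total = run_episode(FakeRemoteEnv(), lambda obs: [0.5] * 18)
```

Keeping the policy as a plain callable means the same function can be tested against a local environment and then passed unchanged to the submission loop.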
You are allowed to:
Note that a model trained in your modified environment must still be compatible with the challenge environment.
You are not allowed to:
The winner will be invited to the 2nd Applied Machine Learning Days at EPFL in Switzerland on January 29 & 30, 2018, with travel and accommodation covered.