

Learning How to Walk

Reinforcement learning environments with musculoskeletal models




Our movement originates in the brain. Many neurological disorders, such as cerebral palsy, multiple sclerosis, or stroke, can lead to problems with walking. Treatments are often symptomatic, and the outcomes of surgeries are hard to predict. Understanding the underlying mechanisms is key to improving treatments. This motivates our efforts to model the motor control unit of the brain.

In this challenge, your task is to model the motor control unit in a virtual environment. You are given a musculoskeletal model with 18 muscles to control. Every 10 ms you send signals to these muscles to activate or deactivate them. The objective is to walk as far as possible in 5 seconds.

To model the physics we use OpenSim, a biomechanical physics environment for musculoskeletal simulations. You can read more details here.

HUMAN environment

NOTE: There have been a few changes to the API of the grading server. Please update your osim-rl installation with: pip install git+https://github.com/kidzik/osim-rl.git and update your submission script by referring to https://github.com/stanfordnmbl/osim-rl/blob/master/scripts/submit.py#L43. In the meantime, if you run into scary-looking error messages when using your previous submission scripts, please do not panic! :D


This challenge wouldn’t be possible without:

For more details and queries please contact



Your task is to build a function f which takes the current state observation (a 31-dimensional vector) and returns the muscle activations, i.e. the action (an 18-dimensional vector), in a way that maximizes the reward.
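To make the shape of f concrete, here is a minimal sketch of a policy with the right input and output sizes. The random linear map, the helper name make_linear_policy, and the clipping of activations to [0, 1] are assumptions made for illustration; they are not part of the challenge specification, and a real entry would learn its parameters.

```python
import random

OBS_DIM = 31  # observation size stated in the challenge
ACT_DIM = 18  # one activation per muscle


def make_linear_policy(seed=0):
    """Return f(observation) -> action for a random linear map (illustrative only)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.1) for _ in range(OBS_DIM)] for _ in range(ACT_DIM)]
    bias = [0.5] * ACT_DIM

    def f(observation):
        if len(observation) != OBS_DIM:
            raise ValueError(f"expected {OBS_DIM} values, got {len(observation)}")
        action = []
        for row, b in zip(weights, bias):
            raw = b + sum(w * o for w, o in zip(row, observation))
            # Assumption: muscle activations are kept within [0, 1].
            action.append(min(1.0, max(0.0, raw)))
        return action

    return f


f = make_linear_policy()
action = f([0.0] * OBS_DIM)  # an all-zero observation, just to exercise f
```

Any learned model can take the place of the linear map, as long as it keeps the 31-in, 18-out signature.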

The trial ends either when the pelvis of the model drops below 0.7 meters or when you reach 500 iterations (corresponding to 5 seconds in the virtual environment). Let N be the length of the trial. Your total reward is simply the position of the pelvis on the x axis after N steps. The value is given in centimeters.

After each iteration you get a reward equal to the change in the x position of the pelvis during that iteration.
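The per-step rewards add up to the final pelvis position, which a short sketch over a recorded trajectory can show. The function name score_trial is hypothetical, and two assumptions are made: the pelvis starts at x = 0, and the step on which the pelvis drops below the threshold still counts towards N.

```python
MAX_STEPS = 500          # 5 seconds at one action every 10 ms
MIN_PELVIS_HEIGHT = 0.7  # meters


def score_trial(pelvis_x, pelvis_height, start_x=0.0):
    """Return (step_rewards, total_reward, n_steps) for a recorded trajectory.

    pelvis_x[i] and pelvis_height[i] are the values observed after step i + 1.
    """
    rewards = []
    previous_x = start_x
    for x, height in zip(pelvis_x, pelvis_height):
        # Reward for this step is the change in the pelvis x position.
        rewards.append(x - previous_x)
        previous_x = x
        # Assumption: the terminating step itself is included in the trial.
        if height < MIN_PELVIS_HEIGHT or len(rewards) >= MAX_STEPS:
            break
    return rewards, sum(rewards), len(rewards)


# The third step falls below 0.7 m, so the fourth sample is never scored.
rewards, total, n_steps = score_trial(
    pelvis_x=[1.0, 2.5, 3.0, 4.0],
    pelvis_height=[0.9, 0.9, 0.6, 0.9],
)
```

Because each reward is a difference of consecutive positions, the sum telescopes to the last scored x position minus the starting one.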

You can test your model on your local machine. To submit, you need to interact with the remote environment: crowdAI sends you the current observation, and you send back the action you take in that state.
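The exchange with the remote environment follows a simple request and response loop. The sketch below uses a hypothetical in-memory stand-in, FakeRemoteEnv, so that it runs offline; it is not the crowdAI client, and its method names are assumptions. For the real client calls, refer to the submit.py script in the osim-rl repository.

```python
class FakeRemoteEnv:
    """Hypothetical stand-in for the grading server; not the crowdAI client API."""

    def __init__(self, n_steps=5, obs_dim=31):
        self.n_steps = n_steps
        self.obs_dim = obs_dim
        self.step_count = 0

    def reset(self):
        self.step_count = 0
        return [0.0] * self.obs_dim

    def step(self, action):
        self.step_count += 1
        observation = [float(self.step_count)] * self.obs_dim
        done = self.step_count >= self.n_steps
        return observation, done


def run_remote_episode(env, policy):
    """Receive an observation, send back an action, and repeat until the trial ends."""
    observation = env.reset()
    done = False
    actions_sent = 0
    while not done:
        action = policy(observation)
        observation, done = env.step(action)
        actions_sent += 1
    return actions_sent


def constant_policy(observation):
    # Placeholder policy: the same activation for all 18 muscles.
    return [0.5] * 18


actions_sent = run_remote_episode(FakeRemoteEnv(n_steps=5), constant_policy)
```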


You are allowed to:

  • Modify the objective function for training (e.g. an extra penalty for falling or moving too fast, a reward for keeping the head at the same level, etc.)
  • Modify the musculoskeletal model for training (e.g. constrain the Y axis of the pelvis)
  • Submit at most one submission every 6 hours.

Note that the model trained in your modified environment must still be compatible with the challenge environment.
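One way to change the training objective while staying compatible is to wrap the environment and alter only the reward, leaving observations and actions untouched. The sketch below assumes a Gym-style step that returns (observation, reward, done, info). The wrapper ShapedRewardEnv, the toy TinyEnv, and the info["fell"] flag are all hypothetical names introduced for illustration.

```python
class TinyEnv:
    """Toy environment for illustration only: the model 'falls' on the third step."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return [0.0] * 31

    def step(self, action):
        self.steps += 1
        fell = self.steps >= 3
        return [0.0] * 31, 1.0, fell, {"fell": fell}


class ShapedRewardEnv:
    """Wrap an environment and change only the reward used for training."""

    def __init__(self, env, fall_penalty=10.0):
        self.env = env
        self.fall_penalty = fall_penalty

    def reset(self):
        return self.env.reset()

    def step(self, action):
        observation, reward, done, info = self.env.step(action)
        # Extra penalty when the trial ends because the model fell.
        if done and info.get("fell", False):
            reward -= self.fall_penalty
        # Observation and action formats pass through unchanged.
        return observation, reward, done, info


env = ShapedRewardEnv(TinyEnv(), fall_penalty=10.0)
env.reset()
shaped_rewards = []
done = False
while not done:
    observation, reward, done, info = env.step([0.5] * 18)
    shaped_rewards.append(reward)
```

Since the wrapper never touches the 31-dimensional observation or the 18-dimensional action, a policy trained against it can still be run against the unmodified challenge environment.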

You are not allowed to:

  • Use external datasets (e.g. kinematics of people walking)
  • Engineer the trajectories/muscle activations by hand


  • crowdAI reserves the right to modify challenge rules as required.


The winner will be invited to the 2nd Applied Machine Learning Days at EPFL in Switzerland on January 29 & 30, 2018, with travel and accommodation covered.


Please refer to the Getting Started guide in the Dataset section of the challenge for more details on how to access the challenge environments and for a basic tutorial on how to make your first submission.