NIPS 2017: Learning to Run

Reinforcement learning environments with musculoskeletal models




Updates for participants: Please read about the latest changes and the logistics of the second round here, here, and here (last update: November 6th).

Welcome to Learning to Run, one of the 5 official challenges in the NIPS 2017 Competition Track. In this competition, you are tasked with developing a controller that enables a physiologically based human model to navigate a complex obstacle course as quickly as possible. You are provided with a human musculoskeletal model and a physics-based simulation environment in which you can synthesize physically and physiologically accurate motion. Potential obstacles include external obstacles, such as steps or a slippery floor, and internal obstacles, such as muscle weakness or motor noise. You are scored on the distance you travel through the obstacle course in a set amount of time.

Our objectives are to:

  • apply deep reinforcement learning to problems in medicine,
  • promote open-source tools in RL research (the physics simulator, the RL environment, and the competition platform are all open-source),
  • encourage RL research in computationally complex environments with stochasticity and high-dimensional action spaces.

Follow the instructions in the Getting Started guide in the Dataset section of the challenge and visit our GitHub repo to get started!

First Prize – NVIDIA DGX Station™

NVIDIA DGX Station™ is the Fastest Personal Supercomputer for Researchers and Data Scientists.

Computing support – Amazon AWS cloud credits

Amazon AWS has generously agreed to support participants of the challenge with $30,000 worth of cloud credits. The top 100 performers as per the leaderboard on August 13th, 2017, 23:59:59 UTC, received $300 AWS cloud credits.





Your task is to build a function f that takes the current state observation (a 41-dimensional vector) and returns the muscle excitations, or action (an 18-dimensional vector), in a way that maximizes the reward. Your total reward is the position of the pelvis on the x axis after the last iteration, minus a penalty for using ligament forces. Ligaments are tissues that prevent your joints from bending too much. Overusing them leads to injuries, so we want to avoid it. The penalty in the total reward is equal to the sum of the forces generated by ligaments over the trial, divided by 1000. For details on evaluation, please refer to the Getting Started guide in the Dataset section of the challenge.
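The shape of the task and the reward formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the challenge's evaluation code: the names random_controller and total_reward are hypothetical, the excitation range of 0 to 1 is an assumption not stated on this page, and the per-step ligament forces are assumed to be available as a plain list of numbers.

```python
import random

OBSERVATION_SIZE = 41   # size of the state observation, from the task description
ACTION_SIZE = 18        # number of muscle excitations, from the task description
LIGAMENT_PENALTY_DIVISOR = 1000.0  # "divided by 1000", from the task description


def random_controller(observation):
    """Hypothetical baseline for f: ignore the observation, return random excitations.

    Assumption: each excitation lies in [0, 1); the page does not state the range.
    """
    if len(observation) != OBSERVATION_SIZE:
        raise ValueError("expected a 41-dimensional observation")
    return [random.random() for _ in range(ACTION_SIZE)]


def total_reward(final_pelvis_x, ligament_forces_per_step):
    """Final pelvis x position minus the summed ligament forces divided by 1000."""
    penalty = sum(ligament_forces_per_step) / LIGAMENT_PENALTY_DIVISOR
    return final_pelvis_x - penalty


if __name__ == "__main__":
    action = random_controller([0.0] * OBSERVATION_SIZE)
    print(len(action))
    # A pelvis at x = 20.0 with a total ligament force of 2000 over the trial.
    print(total_reward(20.0, [500.0, 1500.0]))
```

A learned controller would replace random_controller with a policy that actually uses the observation; the reward function shows why a fast gait that strains the ligaments can still score worse than a gentler one.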


To avoid overfitting to the training environment, participants with a score above 15 will be asked to resubmit their solutions in the second round of the challenge. The final ranking will be based on the results from the second round.

Round 2 Rules:

  • 1) All eligible participants/teams are allowed 5 successful submissions (instead of 3) and up to 2 failed submissions.
  • 2) The submitted container will not have access to the external network while being graded.
  • 3) Each submission container is allowed to use a maximum of 5 GB of memory.
  • 4) Each submission will be evaluated for at least N=10 simulations.
  • 4.5) In case of a tie between the top two participants, we will re-run their submissions with N=20, and the new scores will be used as a tie breaker.
  • 5) The timeout for a submission is 8 hours. In case of N > 10, the timeout will be increased proportionally.
  • 6) Each team can use only one account in the second round.
  • 7) A team with two or more accounts accepted in the second round is obliged to report this to the organizers immediately, before submitting any solution.
  • 7.5) To be eligible for the prize as a team, the combined submissions from the accounts of all team members in round 2 must not exceed 5 successful submissions (plus 2 failed submissions).
  • 8) The winners will be asked to release the code and the trained models of their solution.
  • 9) Violation of the rules or other unfair activity may result in disqualification from the prizes.
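The arithmetic behind rules 5 and 7.5 can be sketched as follows. This is a rough illustration, not the grader's logic: the function names are hypothetical, "proportionally" is assumed to mean linear scaling with N, and the combined limit is assumed to apply to successful and failed submissions separately.

```python
BASE_SIMULATIONS = 10      # N=10, from rule 4
BASE_TIMEOUT_HOURS = 8.0   # from rule 5
MAX_SUCCESSFUL = 5         # from rules 1 and 7.5
MAX_FAILED = 2             # from rules 1 and 7.5


def timeout_hours(n_simulations):
    """Timeout for N simulations, assuming linear scaling from 8 hours at N=10."""
    if n_simulations < BASE_SIMULATIONS:
        raise ValueError("submissions are evaluated for at least N=10 simulations")
    return BASE_TIMEOUT_HOURS * n_simulations / BASE_SIMULATIONS


def team_within_limits(per_account_counts):
    """Check rule 7.5 for a team.

    per_account_counts is a list of (successful, failed) pairs, one per account.
    """
    successful = sum(s for s, _ in per_account_counts)
    failed = sum(f for _, f in per_account_counts)
    return successful <= MAX_SUCCESSFUL and failed <= MAX_FAILED


if __name__ == "__main__":
    print(timeout_hours(20))                      # tie-breaker run with N=20
    print(team_within_limits([(3, 1), (2, 1)]))   # two accounts, exactly at the limit
    print(team_within_limits([(3, 0), (3, 0)]))   # six successful submissions in total
```

Under these assumptions, a tie-breaker run with N=20 would get twice the base timeout, and a team split across two accounts shares a single submission budget.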

Additional rules:

  • You are not allowed to use external datasets (e.g., kinematics of people walking),
  • NVIDIA teams are not eligible for the first prize,
  • Organizers reserve the right to modify challenge rules as required.


1st - NVIDIA DGX Station™

2nd - NVIDIA Titan Xp

3rd - NVIDIA Titan Xp


  • Invitation to publish articles in the NIPS competition book.
  • Invitation to the 2nd Applied Machine Learning Days at EPFL in Switzerland on January 29 & 30, 2018, with travel and accommodation covered.
  • Invitation to give a research talk at Stanford, with travel and accommodation covered.
  • Reimbursement of travel and accommodation at NIPS 2017

NVIDIA DGX Station™ is the Fastest Personal Supercomputer for Researchers and Data Scientists, with the following benefits:

  • Revolutionary form factor - designed for the desk, whisper-quiet
  • Start experimenting in hours, not weeks, powered by DGX Stack
  • Productivity that goes from desk to data center to cloud
  • Breakthrough performance and precision – powered by Volta


Please refer to the Getting Started guide in the Dataset section of the challenge for more details on how to access the challenge environments and for a basic tutorial on how to make your first submission.

We are in the process of compiling the chapter for this year's book on the NIPS Competition Track. In the meantime, here are some interesting articles and blog posts written by participants:

Contact Us

We strongly encourage you to use the public channels mentioned above for communication between participants and organisers. In exceptional cases, if you have queries or comments that you would prefer to send through a private channel, you can email us at: