

LifeCLEF 2018 Bird - Soundscape

Bird sound recognition in soundscape recordings



Note: This challenge is one of the two subtasks of the LifeCLEF Bird identification challenge 2018. For more information about the other subtask, click here. Both challenges share the same training dataset.

Challenge description

The goal of the task is to localize and identify all audible birds within the provided soundscape recordings. Each soundscape is divided into segments of 5 seconds, and a list of species with associated probability scores has to be returned for each segment. Each prediction item (i.e. each line of the run file) has to respect the following format: <MediaId;TC1-TC2;ClassId;probability> where probability is a real value in [0;1] that increases with the confidence in the prediction, and where TC1-TC2 is a timecode interval of 5 seconds in which each timecode has the format hh:mm:ss (e.g. 00:00:00-00:00:05, then 00:00:05-00:00:10).

Here is a short fake run example respecting this format, covering 3 segments of 5 seconds from two MediaIds: soundscape_fake_run
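As an illustration of the line format described above, here is a minimal Python sketch that builds one prediction line from a segment index. The function names `timecode` and `run_line`, and the identifiers `SSW001` and `abcdef`, are hypothetical placeholders and are not taken from the official data or tools.

```python
def timecode(seconds: int) -> str:
    """Format a non-negative number of seconds as hh:mm:ss."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def run_line(media_id: str, segment_index: int, class_id: str,
             probability: float, segment_seconds: int = 5) -> str:
    """Build one line of the form MediaId;TC1-TC2;ClassId;probability.

    segment_index counts 5-second segments from the start of the
    soundscape (0 for 00:00:00-00:00:05, 1 for 00:00:05-00:00:10, ...).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    start = segment_index * segment_seconds
    end = start + segment_seconds
    return f"{media_id};{timecode(start)}-{timecode(end)};{class_id};{probability}"


# Hypothetical identifiers, for illustration only.
print(run_line("SSW001", 0, "abcdef", 0.83))
print(run_line("SSW001", 1, "abcdef", 0.12))
```

The fake run file linked above remains the reference for the exact layout; this sketch only mirrors the format as it is described in the text.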

Each participating group is allowed to submit up to 4 runs built from different methods. Semi-supervised, interactive or crowdsourced approaches are allowed but will be compared independently from fully automatic methods. Any human assistance in the processing of the test queries must therefore be flagged in the submitted runs.

Participants are allowed to use any of the provided metadata complementary to the audio content (.wav files at a 44.1 kHz, 48 kHz or 96 kHz sampling rate), and will also be allowed to use any external training data on the condition that (i) the experiment is entirely reproducible, i.e. that the external resource used is clearly referenced and accessible to any other research group in the world, (ii) participants submit at least one run without external training data so that we can study the contribution of such resources, and (iii) the additional resource does not contain any of the test observations. In particular, it is strictly forbidden to crawl training data from: www.xeno-canto.org


The training set contains 36,496 monophonic recordings from the Xeno-Canto network covering 1500 species of Central and South America (the largest bioacoustic dataset in the literature). It has a massive class imbalance, with a minimum of four recordings for Laniocera rufescens and a maximum of 160 recordings for Henicorhina leucophrys. Recordings are associated with various metadata such as the type of sound (call, song, alarm, flight, etc.), the date, the location, textual comments of the authors, multilingual common names and collaborative quality ratings. Complementary to that data, a validation set of soundscapes with time-coded labels will be provided as training data. It contains about 20 minutes of soundscapes representing 240 segments of 5 seconds, with a total of 385 bird species annotations.

The test set itself will contain about 6 hours of soundscapes split into 4382 segments of 5 seconds (to be processed as separate queries). Some of them will be stereophonic, offering the possibility of source separation to enhance the recognition.
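For orientation, 4382 segments of 5 seconds amount to 21,910 seconds, a little over 6 hours. The sketch below shows one way to compute the sample index range of each 5-second segment for a recording at a given sampling rate. The function name `segment_bounds` is hypothetical, and dropping a trailing partial segment is an assumption, not a rule stated by the organizers.

```python
def segment_bounds(n_samples: int, sample_rate: int, segment_seconds: int = 5):
    """Yield (start, end) sample indices of consecutive full-length segments.

    Assumption: a trailing remainder shorter than one segment is dropped.
    """
    step = sample_rate * segment_seconds
    for start in range(0, n_samples - step + 1, step):
        yield start, start + step


# A 12-second recording at 44.1 kHz yields two full 5-second segments.
for start, end in segment_bounds(44100 * 12, 44100):
    print(start, end)
```

For a stereophonic file the same index ranges would apply to each channel.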

Submission instructions

As soon as submissions are open, you will find a “Create Submission” button on this page (just next to the tabs).

Results (tables and figures)

(Official round during the LifeCLEF 2018 campaign)


The metric used will be the classification mean Average Precision (c-mAP), considering each class c of the ground truth as a query. This means that for each class c, we will extract from the run file all predictions with ClassId=c, rank them by decreasing probability and compute the average precision for that class. We will then take the mean across all classes. More formally:

$$\mathrm{cmAP} = \frac{1}{C} \sum_{c=1}^{C} \mathrm{AveP}(c)$$

where C is the number of species in the ground truth and AveP(c) is the average precision for a given species c computed as:

$$\mathrm{AveP}(c) = \frac{\sum_{k=1}^{n} P(k) \times \mathrm{rel}(k)}{n_{\mathrm{rel}}}$$

where k is the rank of an item in the list of the predicted segments containing c, n is the total number of predicted segments containing c, P(k) is the precision at cut-off k in the list, rel(k) is an indicator function equaling 1 if the segment at rank k is a relevant one (i.e. is labeled as containing c in the ground truth) and nrel is the total number of relevant segments for c.
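The metric described above can be sketched in a few lines of Python. This is an illustrative implementation written from the textual definition, not the organizers' official evaluation script. The function names, the input layout (tuples of segment id, class id and probability) and the tie-breaking behavior (ties keep input order) are assumptions, and each (segment, class) pair is assumed to appear at most once in a run.

```python
from collections import defaultdict


def average_precision(ranked_relevance, n_relevant):
    """AveP for one class: sum over ranks k of P(k) * rel(k), divided by nrel."""
    if n_relevant == 0:
        return 0.0
    hits = 0
    total = 0.0
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / k  # P(k): precision at cut-off k
    return total / n_relevant


def c_map(predictions, ground_truth):
    """Mean of AveP over the classes present in the ground truth.

    predictions: iterable of (segment_id, class_id, probability)
    ground_truth: dict mapping class_id -> set of relevant segment_ids
    """
    by_class = defaultdict(list)
    for segment_id, class_id, probability in predictions:
        by_class[class_id].append((probability, segment_id))

    scores = []
    for class_id, relevant in ground_truth.items():
        # Rank this class's predictions by decreasing probability.
        ranked = sorted(by_class.get(class_id, []),
                        key=lambda item: item[0], reverse=True)
        rels = [segment_id in relevant for _, segment_id in ranked]
        scores.append(average_precision(rels, len(relevant)))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data with hypothetical segment and class identifiers.
truth = {"A": {"s1", "s3"}, "B": {"s2"}}
preds = [("s1", "A", 0.9), ("s2", "A", 0.8), ("s3", "A", 0.7),
         ("s1", "B", 0.6), ("s2", "B", 0.4)]
print(c_map(preds, truth))
```

In this toy example, class A has relevant segments at ranks 1 and 3, giving AveP = (1 + 2/3) / 2 = 5/6, and class B has its single relevant segment at rank 2, giving AveP = 1/2, so the mean is 2/3. Predictions for classes absent from the ground truth are ignored, since only ground-truth classes act as queries.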


The LifeCLEF lab is part of the Conference and Labs of the Evaluation Forum: CLEF 2018. CLEF 2018 consists of independent peer-reviewed workshops on a broad range of challenges in the fields of multilingual and multimodal information access evaluation, and a set of benchmarking activities carried out in various labs designed to test different aspects of mono- and cross-language information retrieval systems. More details about the conference can be found here.

Submitting a working note with the full description of the methods used in each run is mandatory. Any run that cannot be reproduced from its description in the working notes might be removed from the official publication of the results. Working notes are published within CEUR-WS proceedings, resulting in the assignment of an individual DOI (URN) and indexing by many bibliography systems including DBLP. According to the CEUR-WS policies, a light review of the working notes will be conducted by the LifeCLEF organizing committee to ensure quality. As an illustration, LifeCLEF 2017 working notes (task overviews and participant working notes) can be found within CLEF 2017 CEUR-WS proceedings.


Participants of this challenge will automatically be registered at CLEF 2018. In order to be compliant with the CLEF registration requirements, please edit your profile by providing the following additional information:

  • First name

  • Last name

  • Affiliation

  • Address

  • City

  • Country

  • Regarding the username, please choose a name that represents your team.

This information will not be publicly visible and will be used exclusively to contact you and to send the registration data to CLEF, which is the main organizer of all CLEF labs.


Contact us

We strongly encourage you to use the public channels mentioned above for communications between the participants and the organizers. In extreme cases, if there are any queries or comments that you would like to make through a private communication channel, you can send us an email at:

  • Sharada Prasanna Mohanty: sharada.mohanty@epfl.ch
  • Hervé Glotin: glotin[AT]univ-tln[DOT]fr
  • Hervé Goëau: herve[DOT]goeau[AT]cirad[DOT]fr
  • Alexis Joly: alexis[DOT]joly[AT]inria[DOT]fr
  • Ivan Eggel: ivan[DOT]eggel[AT]hevs[DOT]ch

More information

You can find additional information on the challenge here: http://imageclef.org/node/230

Baseline Repository

A baseline system and an accompanying tutorial can be found here: https://github.com/kahst/BirdCLEF-Baseline

We encourage all participants of the challenge to build upon the provided code base and share the results for future reference.
