Building Missing Maps with Machine Learning
By Humanity & Inclusion
about 2 years ago
My submission uses Mask R-CNN, specifically Matterport's Keras-based implementation. It's a single model with a ResNet-101 backbone and a single test-time augmentation. This challenge really boils down to detecting small objects and objects at tile boundaries.
For the small objects, small anchors are necessary; mine were [8, 16, 32, 64, 128]. Even then, because the ResNet backbone down-samples by a factor of 4 relative to the image dimensions in its first stage (using a stride of 2 in both the conv and max-pool layers), the smaller objects are hard to detect. This can be alleviated by up-sampling the image. I chose instead to change the stride of the first-stage convolution from 2 to 1, so my backbone strides are [2, 4, 8, 16, 32]. I also padded the image, which helps get better overlap between the positive and negative regions of interest near the border.
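The anchor, stride, and padding changes above could be sketched as follows. This is an illustrative stand-in, not the actual training code: the attribute names follow Matterport's config style (`RPN_ANCHOR_SCALES`, `BACKBONE_STRIDES`) but are written as a plain class so the sketch runs standalone, and the pad width in the helper is a guess.

```python
import numpy as np

# Sketch of the config overrides described above, using
# Matterport-style attribute names as plain Python.
class BuildingConfig:
    # Smaller anchor scales so the RPN can propose tiny buildings
    # (Matterport's default is (32, 64, 128, 256, 512)).
    RPN_ANCHOR_SCALES = (8, 16, 32, 64, 128)
    # With the stage-1 conv stride changed from 2 to 1, every feature
    # level sits at half its default stride ([4, 8, 16, 32, 64]).
    BACKBONE_STRIDES = [2, 4, 8, 16, 32]

def pad_image(image, pad=32):
    """Reflect-pad an HxWxC image on all four sides so instances near
    the border get fuller anchor coverage (pad width is hypothetical)."""
    return np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
```

For example, padding a 300x300x3 tile with `pad=32` yields a 364x364x3 input; predictions are then cropped back to the original extent.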
This approach got me to the top of the leaderboard, number 1, with a best precision of 0.93731 (rounded to 0.937) and a recall of 0.957 before the competition deadline of 12:00 UTC on Monday, August 20, 2018. But I am in second position; apparently submissions after the deadline were still accepted. I didn't know that. I think the timekeeping there is a little off; the transparent timestamps on the submissions are what I saw. That's a little sad to see in an EPFL competition. We are lucky to be given such an opportunity to participate. Maybe I could have run a sweep, with simultaneous submissions, over the detection thresholds to see which best fits the test data; in hindsight this would have been a better approach and might have helped me reach the 0.938 precision rounding mark. The recall with this approach seems good, though; I didn't tweak the trade-off much.
The dataset in this challenge is homogeneous. The large and medium building instances have very high precision and recall. Apart from dilating stage 5 of the ResNet backbone, I did not have many ideas on how to improve detection of buildings above medium size. Looking at the samples, I believe the houses are mostly located in Europe, but generalizing from that to houses in other regions of the world will likely be problematic. For instance, refugee tents will not only be small but will also have a different spectral signature. So making this dataset practically useful for disaster response will need further fine-tuning with additional, more diverse datasets, if the organizers are looking to do that. I am from Nepal, by the way; we Sherpas fear that we might become refugees too, due to glacier outbursts in the Himalayas. Submitting the best model on time, not after, is what will be truly useful.
about 2 years ago
Thanks for sharing your ideas! We actually thought briefly about using Mask R-CNN, but we were afraid it wouldn't manage to find small instances. It turns out that with your improvements it actually gave far better recall than our U-Net (which was around 0.952 before shrinking the model so that it would pass the submission process), but 0.959 (I saw that one for a moment) is amazing, congrats! Did you try adding a second-level model to score the predictions so that the gap between AP and AR could be smaller? I think with that you could have easily broken 0.94 (we were able to get 0.942 before shrinking; I wonder if with your AR you could get 0.945?).
As far as the deadline goes, I think @spMohanty will explain that, but there was an announcement on Gitter that the last few submissions (made before the deadline) had to be killed due to some problems, and contestants were allowed to resubmit them.
Thank you so much for competing! And thank you to the organizers for this challenge!