

Spotify Sequential Skip Prediction Challenge

Predict if users will skip or listen to the music they're streamed


Status: Completed. 686 submissions, 526 participants, 29396 views.

The training set is 56 GB

Posted by chuttipayan over 2 years ago

How do you handle such a big dataset? Are you using a Google Cloud environment?

2

Posted by cddt over 2 years ago

My problem is trying to download it. Would be great if the server supported resuming, or a torrent was available.

1

Posted by xxl over 2 years ago

So big...

Posted by brianbrost over 2 years ago

We are looking into options for making the dataset easier to download and will keep you updated.

1

Posted by Definitive Turtles over 2 years ago

Quoting cddt: "My problem is trying to download it. Would be great if the server supported resuming, or a torrent was available."

We managed to download it using wget, which can continue an interrupted download: wget -c --tries=inf [url here]

4

Posted by cddt over 2 years ago

Thanks. I also finally managed it with curl and the -C - option; wget is also a good option.
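
For anyone who would rather script the download, here is a minimal Python sketch of the same resume idea that wget -c and curl -C - rely on: ask the server only for the bytes that are not on disk yet, using an HTTP Range header. The function names are made up for illustration, and it assumes the server honours Range requests, which nobody in this thread has confirmed for the challenge server.

```python
import os
import urllib.request


def range_header(dest):
    """Return a Range header asking for the bytes not yet on disk."""
    done = os.path.getsize(dest) if os.path.exists(dest) else 0
    return {"Range": f"bytes={done}-"} if done else {}


def resume_download(url, dest, chunk_size=1 << 20):
    """Append the missing part of `url` to `dest` (hypothetical helper)."""
    request = urllib.request.Request(url, headers=range_header(dest))
    # Note: if `dest` is already complete, a server may answer 416, which
    # urllib raises as urllib.error.HTTPError.
    with urllib.request.urlopen(request) as response:
        # 206 means the Range request was honoured, so we append.
        # Any other success status means the whole file is being sent
        # again, so the partial copy is overwritten.
        mode = "ab" if response.status == 206 else "wb"
        with open(dest, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

Calling resume_download again after a dropped connection should pick up where the previous attempt stopped, as long as the partial file was left in place.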

Posted by Aleksei S. Popov over 2 years ago

Split it into chunks or use file stream input.
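
One way to read "file stream input" is to walk the .tar.gz member by member and read each CSV row by row, so the archive is never unpacked to disk or loaded into memory as a whole. A rough Python sketch follows; it assumes the archive holds UTF-8 CSV files with a header row, and the function name is made up for illustration.

```python
import csv
import tarfile


def iter_rows(archive_path):
    """Yield (member_name, row_dict) for every CSV row in a .tar.gz archive."""
    # "r:gz" decompresses on the fly; nothing is extracted to disk.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            # Skip directories and anything that is not a CSV file.
            if not (member.isfile() and member.name.endswith(".csv")):
                continue
            raw = archive.extractfile(member)
            # Decode one line at a time so only the current row is in memory.
            lines = (line.decode("utf-8") for line in raw)
            for row in csv.DictReader(lines):
                yield member.name, row
```

Because iter_rows is a generator, you can aggregate statistics or write out smaller chunk files inside a single for loop without ever holding the full dataset.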

Posted by tiopon over 2 years ago

I tried multiple times, but the downloaded file is still corrupted. The error message is: gzip: 20181113_training_set.tar.gz: invalid compressed data--format violated
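
Before attempting a long extraction, it may be worth checking whether the downloaded file decompresses cleanly from start to finish. The Python sketch below does roughly what gzip -t does: it reads the whole stream and reports whether any error was raised. The function name is made up for illustration, and a passing check only shows that the gzip stream is internally consistent, not that it matches the file on the server.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses without errors."""
    try:
        with gzip.open(path, "rb") as stream:
            # Read and discard the data; only the errors matter here.
            while stream.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC mismatches, EOFError covers
        # truncated files, zlib.error covers corrupted compressed data.
        return False
    return True
```

If the check fails on a file that was resumed several times, deleting the partial file and downloading it again from scratch is probably the safest next step.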

2