over 2 years ago
I am finding 57 duplicated probes in the (new?) training subset when I check the RS ids in ‘SNP_subset_detail.txt’. I checked, and they are not duplicated in ‘SNP_fullset_detail.txt’. Many of the duplicates map to a minor genotype given as ‘.’ (period) and correspond to a very sparse coding vector.
The first few duplicated probe numbers are: 274 395 673 888 1015 1040
They map onto these SNP ids: rs2792793 rs1676499 rs6666508 rs1529897 rs10210125 rs3768919
This should not be the case, right?
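For anyone who wants to reproduce the check, here is a rough sketch of how the duplicated RS ids could be counted. The file layout is an assumption on my part (tab-delimited, one header row, RS id in the first column), and `find_duplicate_ids` and `read_rs_ids` are hypothetical helper names, not part of any challenge tooling. Probe numbers are taken to be 1-based row positions, which may not match the numbering used above.

```python
import csv
from collections import defaultdict


def find_duplicate_ids(ids):
    """Return {id: [1-based positions]} for every id that occurs more than once."""
    positions = defaultdict(list)
    for probe_number, snp_id in enumerate(ids, start=1):
        positions[snp_id].append(probe_number)
    # Keep only the ids that were seen at more than one position.
    return {snp_id: where for snp_id, where in positions.items() if len(where) > 1}


def read_rs_ids(path, column=0, delimiter="\t", has_header=True):
    """Read one column of ids from a delimited text file.

    Assumption: the detail file is tab-delimited, has a single header row,
    and stores the RS id in the given column. Adjust to the real layout.
    """
    with open(path, newline="") as handle:
        rows = csv.reader(handle, delimiter=delimiter)
        if has_header:
            next(rows, None)  # skip the header row if there is one
        return [row[column] for row in rows if row]
```

Calling `find_duplicate_ids(read_rs_ids("SNP_subset_detail.txt"))` and taking `len()` of the result should give the number of RS ids that appear more than once. Running the same call on ‘SNP_fullset_detail.txt’ should return an empty dict if that file has no duplicates.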