Hi, quick question on the diversity cohorts VCFs. The metadata shows that there should be 604 WGS samples, but syn51732523 only has SNP VCFs for 591 individuals. Specifically, it looks like the missing specimen IDs are: R2310801, R2814322, R3928849, R5143011, R5422277, R5597995, R6163593, R6619132, R6898471, R7829110, R8705042, R9047269, R9609047, R9693165. Were these filtered out due to low quality? Thanks!

Created by Jake Gockley jgockley
Thanks @abby.vanderlinden !
Hey @jgockley , just wanted to let you know that the most recent version of all metadata files for the Diverse Cohorts study has been filtered to just the donors/samples for which we have sequencing data currently available. Sorry for the previous confusion!
Thanks @jaclynbeck !
It looks like these samples are absent because they do not have complete metadata yet (for example, they might be missing some biospecimen data, so their data files were not added). Additional sample data may be added if/when data curators are able to get full information for them. For now, you can ignore any sample info that does not have matching data files.
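In case it helps anyone doing the same check, here is a minimal sketch of how the metadata could be compared against the samples actually present in a VCF. It only uses the Python standard library. The `specimenID` column name, the helper function names, and the toy IDs `S1` to `S3` are assumptions for illustration, not the actual file contents.

```python
import csv
import io


def vcf_samples_from_header(header_line):
    """Return the sample IDs from a VCF '#CHROM' header line.

    The first nine tab-separated columns (#CHROM through FORMAT) are fixed,
    so everything after them is a sample name.
    """
    fields = header_line.rstrip("\n").split("\t")
    return fields[9:]


def missing_specimens(metadata_rows, vcf_sample_ids, id_column="specimenID"):
    """Return metadata IDs that have no matching VCF sample, sorted.

    `id_column` is an assumed column name; adjust it to the real metadata.
    """
    metadata_ids = {row[id_column] for row in metadata_rows}
    return sorted(metadata_ids - set(vcf_sample_ids))


# Toy inputs standing in for the real metadata CSV and VCF header.
metadata_csv = (
    "specimenID,assay\n"
    "S1,wholeGenomeSeq\n"
    "S2,wholeGenomeSeq\n"
    "S3,wholeGenomeSeq\n"
)
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS3\n"

rows = list(csv.DictReader(io.StringIO(metadata_csv)))
missing = missing_specimens(rows, vcf_samples_from_header(header))
print(missing)
```

With real files, the metadata rows would come from the downloaded CSV and the header line from the `#CHROM` line of the VCF; any IDs returned are the ones that can be ignored for now.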
Thank you @jaclynbeck
Hello! Apologies for the delay. I'm still looking into this to try to get an answer for you. In the meantime, I did notice that the [WGS assay metadata](https://www.synapse.org/#!Synapse:syn51757644) also has 591 individuals instead of 604. My best guess right now is that the "missing" samples were either low quality or an obvious X/Y chromosome mismatch. I'll try to get a more concrete answer for you soon! Jaclyn
