Hi, (Maybe this has already been answered, but I would like to know about this.) I am trying to find the relation between SampleName (FASTQ files) and Specimen ID/Individual ID for the ROSMAP dataset. As mentioned on the portal, I merged ROSMAP_clinical.csv with ROSMAP_biospecimen_metadata.csv, but in the end I was unable to link them to the actual FASTQ files from the dataset. Can someone please guide me and tell me how I can achieve this? Thanks!

Created by Saim Momin momin-saim
Yes, I have the same question: how do I know which sample is, say, R1743384 among the fastq.gz files listed here? I want to download only a subset of samples, and I don't know which file belongs to which individual.
Hi, I have a related question. For example, under the "raw" folder, there is a file named "200226-B11-A_Broad_S3_L007_R1_001.fastq.gz" (Synapse ID syn32164598). What is the "Broad_S3" after the library ID? Was each library split into multiple aliquots, and is "Broad_S3" just an internal aliquot ID? Looking forward to your insights!
Dear Victor, I have the same question as Saim. You said to merge this file, ROSMAP_snRNAseq_demultiplexed_ID_mapping.csv, which contains mappings between libraryBatch, cellBarcode, and individualID. But the length of the sequence in cellBarcode (ATTATCCCAGGACATG-1) differs from that of the FASTQ R1 read (TNATGAGCAACTGCCGATGACATACACA). How can we identify the barcode sequence in the FASTQ files? Is it stored in R1, R2, or I1? If the barcodes are stored in I1, some samples lack an I1 file. Thank you for your help.
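For readers comparing the two sequences above, here is a minimal sketch of one possible explanation. It assumes (this is not confirmed anywhere in the thread) that the libraries follow the common 10x Genomics 3' v3 read layout, where R1 holds a 16-base cell barcode followed by a 12-base UMI, and that the trailing "-1" in cellBarcode is a suffix added during processing and is not part of the sequenced bases. The function name `split_r1` and the length constants are hypothetical and should be checked against the assay metadata.

```python
# ASSUMPTION: 10x Genomics 3' v3 layout (16 bp cell barcode + 12 bp UMI in R1).
# The thread does not confirm the chemistry; verify before relying on this.
CELL_BARCODE_LEN = 16
UMI_LEN = 12


def split_r1(read_sequence):
    """Split an R1 read into (cell_barcode, umi) under the assumed layout."""
    expected = CELL_BARCODE_LEN + UMI_LEN
    if len(read_sequence) < expected:
        raise ValueError(f"R1 read shorter than {expected} bases")
    barcode = read_sequence[:CELL_BARCODE_LEN]
    umi = read_sequence[CELL_BARCODE_LEN:expected]
    return barcode, umi


def strip_suffix(cell_barcode):
    """Drop a trailing '-<number>' suffix such as the '-1' seen in the mapping file."""
    return cell_barcode.rsplit("-", 1)[0]


# The R1 read quoted in the post above is 28 bases long, i.e. 16 + 12.
barcode, umi = split_r1("TNATGAGCAACTGCCGATGACATACACA")
mapping_barcode = strip_suffix("ATTATCCCAGGACATG-1")
print(barcode, umi, mapping_barcode)
```

Under this assumption the 16-base prefix of R1 would be what is compared with cellBarcode once the suffix is removed. Raw reads can contain uncalled bases (the "N" here), so an exact string match will not always succeed.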
@Melise9 Hi Melise, Please provide the synIDs for the mentioned entities so that I may investigate further. Thank you, Victor Baham
Hello, I'm sorry if the answer is clear, but how do we find the metadata for the snRNA-seq data with regard to gender and/or sex? I am also confused why the study shares that there are more than 465 unique, individual samples, then goes on in subsequent sections to say that only 8 samples are included. Can someone help me with this? Thank you, Melise
Hi Saim, Additionally, you will need to merge this file, [ROSMAP_snRNAseq_demultiplexed_ID_mapping.csv](https://www.synapse.org/#!Synapse:syn34572333), which contains mappings between libraryBatch, cellBarcode, and individualID. Please let me know if you have any additional questions. Kind regards, Victor
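As a rough illustration of how the mapping file described above could be used, the sketch below groups individualID values by libraryBatch. The column names come from the post; the rows are invented toy data, the assumption that libraryBatch values look like the FASTQ file-name prefix (for example `190403-B4-A`) is unverified, and the helper name `individuals_by_library` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for ROSMAP_snRNAseq_demultiplexed_ID_mapping.csv.
# Column names are taken from the thread; every value below is invented.
MAPPING_CSV = """libraryBatch,cellBarcode,individualID
190403-B4-A,ATTATCCCAGGACATG-1,R0000001
190403-B4-A,CCGTTGAGTCAATACC-1,R0000002
200226-B11-A,GGAACTTAGTTGCATC-1,R0000003
"""


def individuals_by_library(csv_text):
    """Return {libraryBatch: sorted list of individualIDs found in that library}."""
    by_library = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_library[row["libraryBatch"]].add(row["individualID"])
    return {lib: sorted(ids) for lib, ids in by_library.items()}


library_to_individuals = individuals_by_library(MAPPING_CSV)
print(library_to_individuals["190403-B4-A"])
```

With the real file, the same grouping could be fed from `open(path, newline="")` in place of the in-memory string. A library that maps to several individuals would mean a single FASTQ file cannot be assigned to one person without using the cell barcodes.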
Hi Victor, After looking closely, I realized that I had skipped the merge with the assay metadata file. My bad! The issue of several specimen IDs matching a particular query is resolved! Now my question is: how do we link the name of a FASTQ file as shown in the Synapse portal to the merged metadata file? My primary assumption would be that every FASTQ file is associated with a specimenID, which can help to establish the link. Am I on the right track? Best, Saim
Hi Saim, Thank you for the additional context. To clarify, are you also merging those files with the assay metadata files? Best, Victor
Hi @victor.baham Thanks for your reply. As mentioned in the thread above, I am trying to get the metadata annotation for the ROSMAP dataset. Out of the hundreds of ROSMAP Synapse IDs, consider these two in particular: syn32164319 and syn32164389. Downloading them gives me two FASTQ.gz files named **190403-B4-A_Broad_S1_L007_R1.fastq.gz** and **190403-B4-A_Broad_S1_L007_R2.fastq.gz**. My question is how I can **associate the metadata of the obtained FASTQ files** with the merged file obtained from ROSMAP_clinical.csv and ROSMAP_biospecimen_metadata.csv. When I search for "190403-B4" in the specimen ID column of the merged file, several specimen IDs match the query. How can I find the metadata specifically for the files I've downloaded? Any suggestions in this matter would be appreciated. Thanks!
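To make the file-name question concrete, here is a small sketch that splits the FASTQ names quoted in this thread into fields, so that the leading library prefix can be searched for exactly (not as a substring) in the metadata. The field interpretation is an assumption based on the usual Illumina naming pattern `<sample>_S<n>_L<lane>_<read>[_001].fastq.gz`; what "Broad" and the `S<n>` number mean for this dataset is not stated in the thread, and the names `FASTQ_NAME` and `parse_fastq_name` are hypothetical.

```python
import re

# ASSUMPTION: names follow <library>_<site>_S<n>_L<lane>_<read>[_001].fastq.gz.
# The labels "site" and "sample_number" are guesses, not documented field names.
FASTQ_NAME = re.compile(
    r"^(?P<library>.+?)_(?P<site>[A-Za-z]+)_S(?P<sample_number>\d+)"
    r"_L(?P<lane>\d{3})_(?P<read>[RI][12])(?:_001)?\.fastq\.gz$"
)


def parse_fastq_name(file_name):
    """Return the file-name fields as a dict, or None if the name does not match."""
    match = FASTQ_NAME.match(file_name)
    return match.groupdict() if match else None


for name in (
    "190403-B4-A_Broad_S1_L007_R1.fastq.gz",
    "200226-B11-A_Broad_S3_L007_R1_001.fastq.gz",
):
    print(name, "->", parse_fastq_name(name))
```

The `library` field (for example `190403-B4-A`) is the part that could then be compared for equality against the relevant metadata column, which avoids the multiple hits that a substring search for "190403-B4" produces.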
Hi Saim, I would be happy to look into this for you. For clarification purposes, please provide the synIDs of the files you mentioned. Thank you, Victor

How to find relation between Sample name (FASTQ file name) to Specimen ID/Individual ID?