Hello, I am working specifically with GENIE patients with glioblastoma (GBM), which can be primary (first occurrence) or recurrent (oddly labeled as Metastatic under "Sample.Type", which is incorrect terminology, although that is not the focus of this question). There are 3 MSK panels included in GENIE v10: MSK 341, 410, and 468. Each successive panel is an update of the prior one. Many GBM patients have samples that were assayed by 2 of the 3 panels, and based on the "Age at which sequencing was reported" column, it seems that panel 341 was always used before 410 and 468, and 410 before 468. However, both samples are reported as primary tumors. It is highly unlikely that so many MSK patients would have multiple distinct primary GBMs, and the time frames suggest that the later sample/panel result is probably a recurrent GBM, not a primary tumor. However, I don't know how to confirm this. Alternatively, part of the original sample may have been preserved and re-assayed months or years later with the newer panel. That would explain why the tumor is still reported as primary and why the age at sequencing is later for the newer panels. In that case, why is the sample identifier different from that of the "earlier" sample? Can you please advise? Thank you!

Created by Hia Ghosh hghosh1
Hello! We were able to independently confirm that these MSK samples originating from the same patient likely do not represent distinct cases and are duplicates. We successfully filtered these duplicate records out of our analyses of the GENIE dataset and are happy to answer any questions.
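For anyone who wants to reproduce this kind of check, here is a minimal sketch of how patients with more than one "Primary" sample could be flagged. It is not necessarily the filter used above. It assumes a pipe-delimited excerpt with the column names shown later in this thread; the rows, the `TOY` identifiers, and the helper names (`patients_with_multiple_primaries`, `keep_earliest`) are made up for illustration. Keeping the earliest-sequenced sample is only one possible rule.

```python
import csv
import io
from collections import defaultdict

# Toy rows (NOT real GENIE records), laid out like the excerpt in this thread.
SAMPLE = """X.Patient.Identifier|Sample.Identifier|Age.at.Which.Sequencing.was.Reported|Sample.Type|Sequence.Assay.ID
GENIE-MSK-TOY-0001|GENIE-MSK-TOY-0001-T01-IM3|55|Primary|MSK-IMPACT341
GENIE-MSK-TOY-0001|GENIE-MSK-TOY-0001-T02-IM5|57|Primary|MSK-IMPACT410
GENIE-MSK-TOY-0002|GENIE-MSK-TOY-0002-T01-IM6|64|Primary|MSK-IMPACT468
"""

AGE_COL = "Age.at.Which.Sequencing.was.Reported"


def patients_with_multiple_primaries(handle, delimiter="|"):
    """Return {patient_id: [rows]} for patients with 2+ samples labeled Primary."""
    by_patient = defaultdict(list)
    for row in csv.DictReader(handle, delimiter=delimiter):
        if row["Sample.Type"] == "Primary":
            by_patient[row["X.Patient.Identifier"]].append(row)
    return {pid: rows for pid, rows in by_patient.items() if len(rows) > 1}


def age_or_inf(row):
    """Parse the age column; non-numeric values sort last."""
    try:
        return float(row[AGE_COL])
    except ValueError:
        return float("inf")


def keep_earliest(rows):
    """One possible rule (an assumption): keep the earliest-sequenced sample."""
    return min(rows, key=age_or_inf)


flagged = patients_with_multiple_primaries(io.StringIO(SAMPLE))
for pid, rows in flagged.items():
    panels = [r["Sequence.Assay.ID"] for r in rows]
    print(pid, panels, "keep:", keep_earliest(rows)["Sample.Identifier"])
```

Whether the later sample should be dropped as a duplicate or relabeled as a recurrence is exactly the open question in this thread, so the flagged list is better treated as a review queue than as a final filter.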
Dear @hghosh1, Thank you for bringing up these concerns! We will let our collaborators know about this discrepancy and let you know of any updates. Best, Chelsea
Thank you so much for your patience, as I still have some concerns. I've attached a subset of GENIE-v10, and it shows that primary and recurrent GENIE-MSK samples _are_ annotated and distinguished from each other. Can you show this to the collaborators? Just to reiterate, the GENIE-MSK patients in question are also available on cbioportal (Clin Cancer Research study from 2019) and can be downloaded directly from the publication's NIH webpage (Johnsson 2019). On both platforms, primary and recurrent GBM tumors are distinguished and labeled. It seems that, at the very least, the samples that are also from the aforementioned study are mislabeled in the GENIE downloads. I appreciate your continued assistance; would it help to see a linked sheet with the discrepant cases (GENIE annotation vs Publication/cbioportal)? Moreover, if it's easier for me to communicate directly with the contributors from MSK, please let me know!

X.Patient.Identifier|Sample.Identifier|Age.at.Which.Sequencing.was.Reported|Oncotree.Code|Sample.Type|Sequence.Assay.ID|Cancer.Type|Cancer.Type.Detailed|Sample.Type.Detailed
GENIE-MSK-P-0007415|GENIE-MSK-P-0007415-T03-IM6|27|DIFG|Metastasis|MSK-IMPACT468|Glioma|Diffuse Glioma|Metastasis site unspecified
GENIE-MSK-P-0016566|GENIE-MSK-P-0016566-T01-IM6|30|ASTR|Metastasis|MSK-IMPACT468|Glioma|Astrocytoma|Metastasis site unspecified
GENIE-MSK-P-0013544|GENIE-MSK-P-0013544-T02-IM6|48|HGGNOS|Metastasis|MSK-IMPACT468|Glioma|High-Grade Glioma, NOS|Metastasis site unspecified
GENIE-MSK-P-0013836|GENIE-MSK-P-0013836-T01-IM5|56|GSARC|Metastasis|MSK-IMPACT410|Glioma|Gliosarcoma|Metastasis site unspecified
GENIE-MSK-P-0013506|GENIE-MSK-P-0013506-T01-IM5|56|GBM|Metastasis|MSK-IMPACT410|Glioma|Glioblastoma Multiforme|Metastasis site unspecified
GENIE-MSK-P-0018295|GENIE-MSK-P-0018295-T01-IM6|52|GB|Metastasis|MSK-IMPACT468|Glioma|Glioblastoma|Metastasis site unspecified
GENIE-MSK-P-0019497|GENIE-MSK-P-0019497-T01-IM6|58|ASTR|Metastasis|MSK-IMPACT468|Glioma|Astrocytoma|Metastasis site unspecified
GENIE-MSK-P-0042119|GENIE-MSK-P-0042119-T01-IM6|56|GBM|Metastasis|MSK-IMPACT468|Glioma|Glioblastoma Multiforme|Recurrence
GENIE-MSK-P-0000378|GENIE-MSK-P-0000378-T01-IM3|55|GBM|Primary|MSK-IMPACT341|Glioma|Glioblastoma Multiforme|Primary
GENIE-MSK-P-0000653|GENIE-MSK-P-0000653-T01-IM3|64|GSARC|Primary|MSK-IMPACT341|Glioma|Gliosarcoma|Primary
GENIE-MSK-P-0000657|GENIE-MSK-P-0000657-T02-IM5|55|GBM|Primary|MSK-IMPACT410|Glioma|Glioblastoma Multiforme|Primary
GENIE-MSK-P-0000677|GENIE-MSK-P-0000677-T01-IM3|47|GBM|Primary|MSK-IMPACT341|Glioma|Glioblastoma Multiforme|Primary
GENIE-MSK-P-0054997|GENIE-MSK-P-0054997-T01-IM6|84|GBM|Primary|MSK-IMPACT468|Glioma|Glioblastoma Multiforme|Primary
GENIE-MSK-P-0052220|GENIE-MSK-P-0052220-T01-IM6|64|GBM|Primary|MSK-IMPACT468|Glioma|Glioblastoma Multiforme|Primary
Dear @hghosh1, We discussed your inquiry with our external collaborators at MSK and they still maintain "that the current system in place at MSK does not distinguish between primary and recurrent GBM tumor samples." @kundrar, if there can be more clarification to assist with this inquiry, please feel free to comment. Best, Chelsea
Hello, I compared the MSK clinical data from cBioportal (study ID msk_2019) with that from GENIE (data_clinical_sample.txt). msk_2019 and GENIE share at least ~800 samples (am I correct to say that GENIE extracted these samples from this study specifically?). There are discrepancies between the sample type columns in GENIE and msk_2019. At least 200 samples are labeled primary in GENIE but recurrent in the MSK data set. Additionally, at least 27 samples are labeled metastatic in GENIE but are either primary or recurrent in the MSK data set. The sample type data from the MSK data set makes more sense clinically, as I've mentioned before! Can you explain why these records are different? And, in light of what I've mentioned, can you once again confirm that MSK doesn't distinguish between primary and recurrent gliomas?
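To make the comparison above concrete, here is a rough sketch of how the two sample-type annotations could be cross-checked. Several things are assumptions for illustration only: that the msk_2019 export has columns named `SAMPLE_ID` and `SAMPLE_TYPE`, that GENIE sample identifiers match the MSK ones once a `GENIE-MSK-` prefix is removed, and the toy rows themselves (they are not real records). The helper names `strip_prefix` and `discordant` are hypothetical.

```python
from collections import Counter

# Toy rows (NOT real records). Column names on the MSK side are assumed.
genie_rows = [
    {"Sample.Identifier": "GENIE-MSK-TOY-0001-T01-IM3", "Sample.Type": "Primary"},
    {"Sample.Identifier": "GENIE-MSK-TOY-0002-T01-IM5", "Sample.Type": "Primary"},
    {"Sample.Identifier": "GENIE-MSK-TOY-0003-T01-IM6", "Sample.Type": "Metastasis"},
    {"Sample.Identifier": "GENIE-MSK-TOY-0004-T01-IM6", "Sample.Type": "Primary"},
]
msk_rows = [
    {"SAMPLE_ID": "TOY-0001-T01-IM3", "SAMPLE_TYPE": "Primary"},
    {"SAMPLE_ID": "TOY-0002-T01-IM5", "SAMPLE_TYPE": "Recurrence"},
    {"SAMPLE_ID": "TOY-0003-T01-IM6", "SAMPLE_TYPE": "Recurrence"},
]


def strip_prefix(sample_id, prefix="GENIE-MSK-"):
    """Drop the assumed GENIE prefix so identifiers can be matched across sources."""
    return sample_id[len(prefix):] if sample_id.startswith(prefix) else sample_id


def discordant(genie, msk):
    """List (sample_id, genie_type, msk_type) for shared samples whose labels differ."""
    msk_type = {r["SAMPLE_ID"]: r["SAMPLE_TYPE"] for r in msk}
    out = []
    for r in genie:
        sid = strip_prefix(r["Sample.Identifier"])
        other = msk_type.get(sid)  # None when the sample is not in the MSK table
        if other is not None and other.lower() != r["Sample.Type"].lower():
            out.append((sid, r["Sample.Type"], other))
    return out


mismatches = discordant(genie_rows, msk_rows)
# Tally mismatches by (GENIE label, MSK label) pair.
tally = Counter((g, m) for _, g, m in mismatches)
for (g, m), n in tally.items():
    print(f"GENIE={g} vs MSK={m}: {n}")
```

A tally like this, together with the list of discordant sample identifiers, would be the kind of "linked sheet" that could be shared with the contributors.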
Hello @hghosh1, We can confirm that the current system in place at MSK does not distinguish between primary and recurrent GBM tumor samples. @kundrar feel free to add in more clarity regarding this inquiry. Please let us know if you have any further questions. Best, Chelsea
Hi @hghosh1, Thank you for your patience. The MSK team is currently working on this and they will try to get back to you as soon as they can. Looping in @kundrar for visibility. Best, Chelsea
Hello, any updates?
Hi @hghosh1, Apologies for the delay in responding. I have contacted our internal collaborators about this and will let you know when I get a response. Best, Tom
Hi, any updates? If I've missed something, please let me know. I checked the data guide a few more times, and I don't think I've missed anything
Hello, just checking in! Have I missed something in the data guide about this situation?
