Hello, can you make a public version of the GENIE germline filtering code for tumor-only calls (excluding any sensitive information)? I'd like to apply the same process to another institution's data. I have started to recreate it based on page 19 of the GENIE v11 data guide, but I'm not sure how tedious that will be. If you can't make the code available, could you walk me through the specifics of each step as I recreate it?

Created by Hia Ghosh hghosh1
Additionally, can you point out where the "stagingToCbio" function is called and where databaseSynIdMappingDf is created? databaseSynIdMappingDf is a parameter of stagingToCbio, which is defined in the script you pointed out, but I cannot find where that function is called. I would like to find the cis mutations that were filtered out.
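One way to look for call sites in a local clone of the repository is to walk the Python files with the standard-library `ast` module. This is only a minimal sketch: the helper name `find_call_sites` is hypothetical and not part of the GENIE code, and it matches calls by name only, so it will not see calls made through aliases or `getattr`.

```python
import ast
from pathlib import Path


def find_call_sites(repo_root, func_name):
    """Return (file path, line number) pairs where func_name is called."""
    hits = []
    for path in sorted(Path(repo_root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed as Python
        for node in ast.walk(tree):
            if not isinstance(node, ast.Call):
                continue
            target = node.func
            # ast.Name covers func(...); ast.Attribute covers module.func(...)
            name = getattr(target, "id", None) or getattr(target, "attr", None)
            if name == func_name:
                hits.append((str(path), node.lineno))
    return hits
```

Calling `find_call_sites("Genie", "stagingToCbio")`, where `"Genie"` is an assumed path to the cloned repository, would list each file and line where the function is invoked.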
Also, can you elaborate on why GENIE uses both ExAC and gnomAD? I know there are a few variants in ExAC that are not in gnomAD, but generally speaking, they are supposed to overlap.
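To make the overlap question concrete, here is a toy sketch of population-frequency filtering that consults two tables and uses the higher of the two frequencies. It is not the GENIE implementation: the function names, the variant keys, the frequencies, and the threshold are all made up for illustration. It only shows that a variant present in one database but absent from the other is still flagged when both are checked.

```python
def max_population_af(variant_key, *frequency_tables):
    """Highest allele frequency reported for a variant across all tables."""
    return max(
        (table.get(variant_key, 0.0) for table in frequency_tables),
        default=0.0,
    )


def split_by_population_af(variants, exac_af, gnomad_af, threshold):
    """Separate variants into (kept, flagged) by their maximum population AF."""
    kept, flagged = [], []
    for variant in variants:
        if max_population_af(variant, exac_af, gnomad_af) > threshold:
            flagged.append(variant)  # common in a population: likely germline
        else:
            kept.append(variant)
    return kept, flagged


# Toy data: keys are (chrom, pos, ref, alt); every value here is invented.
exac_af = {("1", 100, "A", "G"): 0.02}    # present only in the ExAC table
gnomad_af = {("2", 200, "C", "T"): 0.01}  # present only in the gnomAD table
variants = [("1", 100, "A", "G"), ("2", 200, "C", "T"), ("3", 300, "G", "A")]

THRESHOLD = 0.001  # hypothetical cutoff, not taken from the GENIE data guide
kept, flagged = split_by_population_af(variants, exac_af, gnomad_af, THRESHOLD)
```

With these toy inputs, the first two variants are flagged even though each appears in only one table, and the third is kept because neither table reports it.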
Hi Haley, running pip install -r requirements.txt is exactly the step where pyranges gets installed, and it is also where I hit the error saying that Microsoft Visual C++ Build Tools must be installed. Can you tell me which components of the Build Tools I need to select?
Hi @hghosh1,

Please see the associated [README](https://github.com/Sage-Bionetworks/Genie/blob/develop/README.md) for the list of dependencies. Also copied below:

- Python 3.7 or higher
- pip install -r requirements.txt
- bedtools

Best,
Haley
Hi Haley! I'm installing all the dependencies needed to run your scripts (starting with requirements.txt). One of the packages, pyrle, depends on Microsoft Visual C++ 14.0 (or greater). I've downloaded the Microsoft Visual C++ Build Tools, but which packages or tools does the GENIE code need? The installer makes me check off the relevant tools manually.
Hello @hghosh1,

Thank you for your question. The [GENIE](https://github.com/Sage-Bionetworks/Genie) processing pipeline is available on GitHub and the germline variant filtering occurs [here](https://github.com/Sage-Bionetworks/Genie/blob/cc13197234374320574b7402cea4318b2e952a78/genie/database_to_staging.py#L323). Please let us know if you have specific questions.

Best,
Haley

Replicating GENIE's germline filtering process