Hi, I am looking for a gene x sample matrix of copy number variation (CNV) calls from WGS or WXS, so that I can compare the CNV of some genes between primary and recurrent samples. Is that available for download? Thanks!

Created by Jingyi_Wu
Hi @Jingyi_Wu,

Thank you for your interest in the resource and for raising this request. I have uploaded an aliquot x gene table (which can be converted to a gene x aliquot table) under the Files tab (current release, "variants_gene_copy_number.csv.gz") and added a description of this new file to the Data Dictionary. The gene-level copy number calling approach is detailed in our 2019 paper (PMID:31748746), and I have provided that text below for convenience. A reminder that you will have clearer signals and more confidence in CNV calls if you restrict the aliquot barcodes to those listed as "allow" in the analysis_blocklist table.

-Kevin

--------------------------

**Copy number calling**

A copy number caller loosely based on GATK "CallCopyRatioSegments" (which in turn is based on ReCapSeg) and GISTIC was implemented to call both arm-level and high-level copy number changes, respectively. Segments (from "ModelSegments") with a non-log2 copy ratio between 0.9 and 1.1 were determined to be neutral. These segments were then weighted by length, and a weighted mean and standard deviation of the non-log2 copy ratio (once-filtered) were determined. Outlier segments were removed, and a weighted mean and standard deviation of the non-log2 copy ratio (twice-filtered) were determined again. Segments with a non-log2 copy ratio between 0.9 and 1.1 and segments within two standard deviations of the twice-filtered mean were determined to be neutral, and segments outside of these boundaries were determined to have a low-level amplification or deletion, depending on the direction.

The weighted mean and standard deviation of the non-log2 copy ratio (once-filtered) were then determined individually for each chromosome arm. Outlier segments were removed, and the weighted mean and standard deviation of the non-log2 copy ratio (twice-filtered) were determined again.

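As a rough illustration of the once-filtered and twice-filtered statistics described above, here is a minimal Python sketch. It is not the pipeline's actual code. The function names, the `(copy_ratio, length)` input layout, and the `0 / 1 / -1` call encoding are hypothetical. The text does not say how outlier segments are defined, so the sketch assumes an outlier is a segment more than two standard deviations from the once-filtered mean, and it assumes the direction of a low-level call is taken relative to the twice-filtered mean.

```python
import math


def weighted_mean_sd(ratios, lengths):
    """Length-weighted mean and (population) standard deviation."""
    total = sum(lengths)
    mean = sum(r * w for r, w in zip(ratios, lengths)) / total
    var = sum(w * (r - mean) ** 2 for r, w in zip(ratios, lengths)) / total
    return mean, math.sqrt(var)


def call_low_level(segments, lo=0.9, hi=1.1, n_sd=2.0):
    """Call segments as neutral (0), low-level gain (1) or loss (-1).

    segments: list of (non_log2_copy_ratio, length_bp) tuples.
    Raises an error if no segment falls inside [lo, hi].
    """
    # Once-filtered: keep only segments with a ratio between lo and hi.
    neutral = [(r, w) for r, w in segments if lo <= r <= hi]
    mean1, sd1 = weighted_mean_sd(*zip(*neutral))

    # Twice-filtered: drop outliers. ASSUMPTION: an outlier lies more than
    # n_sd standard deviations from the once-filtered mean.
    kept = [(r, w) for r, w in neutral if abs(r - mean1) <= n_sd * sd1]
    mean2, sd2 = weighted_mean_sd(*zip(*kept))

    calls = []
    for r, _ in segments:
        if lo <= r <= hi or abs(r - mean2) <= n_sd * sd2:
            calls.append(0)  # neutral
        else:
            # ASSUMPTION: direction is judged against the twice-filtered mean.
            calls.append(1 if r > mean2 else -1)
    return calls, mean2, sd2


segments = [(1.0, 100), (1.02, 200), (0.98, 150), (1.5, 50), (0.5, 30), (1.05, 120)]
calls, mean2, sd2 = call_low_level(segments)
print(calls)
```

Under the same assumptions, the arm-level statistics could be obtained by running the filtering steps on the segments of one chromosome arm at a time.
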
To determine a high-level amplification and deletion threshold, the most highly amplified and deleted chromosome arms were selected, respectively. The twice-filtered mean plus (high-level amplification) or minus (high-level deletion) two times the standard deviation of the selected arms was used as the high-level threshold. Gene-level copy numbers were called by intersecting the gene boundaries with the segment intervals and by calculating the weighted non-log2 copy ratio for that gene. The copy number call for that gene was then determined by comparing the gene-level non-log2 copy ratio to the previously determined thresholds.
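
The gene-level step can be sketched in a similar way. This is only an illustration under stated assumptions, not the released implementation: the function names, the half-open `(start, end, copy_ratio)` segment layout, the `-2` to `2` call encoding, and the handling of values that fall exactly on a threshold are all hypothetical. The thresholds are passed in as arguments because they come from the per-sample steps described above.

```python
def gene_weighted_ratio(gene_start, gene_end, segments):
    """Overlap-weighted non-log2 copy ratio for one gene.

    segments: iterable of (start, end, copy_ratio) on the gene's chromosome,
    using half-open coordinates (ASSUMPTION). Returns None if no segment
    overlaps the gene.
    """
    total_bp = 0
    weighted_sum = 0.0
    for seg_start, seg_end, ratio in segments:
        # Number of bases shared by the gene and this segment.
        overlap = min(gene_end, seg_end) - max(gene_start, seg_start)
        if overlap > 0:
            total_bp += overlap
            weighted_sum += ratio * overlap
    if total_bp == 0:
        return None
    return weighted_sum / total_bp


def call_gene(ratio, del_high, del_low, amp_low, amp_high):
    """Compare a gene-level ratio to the sample's thresholds.

    Returns -2 / -1 / 0 / 1 / 2 (hypothetical encoding) for high-level
    deletion, low-level deletion, neutral, low-level amplification and
    high-level amplification, or None if the gene has no coverage.
    """
    if ratio is None:
        return None
    if ratio <= del_high:
        return -2
    if ratio < del_low:
        return -1
    if ratio >= amp_high:
        return 2
    if ratio > amp_low:
        return 1
    return 0


# Toy example: a gene spanning the boundary between two segments.
segments = [(0, 1000, 1.0), (1000, 2000, 2.0)]
ratio = gene_weighted_ratio(500, 1500, segments)
print(ratio, call_gene(ratio, del_high=0.5, del_low=0.9, amp_low=1.1, amp_high=1.8))
```
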
I found the table cnv_data in figure1_3_input.RData but it only shows a few genes that are not what I am interested in. Btw, is there a way I can find the annotation for columns in cnv_data? Thanks!
