What is the easiest way to find here either microarrays or RNA-Seq summary (gene level) of schizophrenia data (including clinical data)?
Created by Assif Yitzhaky (assify)

@hagenaue - I don't think we have any ready-made tables with cause of death, but our Nature Scientific Data paper can help you with sample sizes by diagnosis: https://www.nature.com/articles/s41597-019-0183-6.

This is a shot in the dark - but has anyone produced a simple cross-table of diagnosis vs. cause of death that could be viewed by individuals without controlled access?
I'm trying to get a sense of the sample sizes and diagnosis balance before jumping through the hoops for deeper access.
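For anyone who already has access to a clinical metadata file, a cross-table of this kind takes only a few lines to produce. The sketch below is only an illustration: the column names `diagnosis` and `causeOfDeath` and the toy records are made up for the example and are not taken from any CommonMind or PsychENCODE file.

```python
import csv
import io
from collections import Counter


def cross_table(rows, row_key, col_key):
    """Count records for each (row_key value, col_key value) pair."""
    return Counter((r[row_key], r[col_key]) for r in rows)


# Made-up records standing in for a clinical metadata file.
# Column names here are hypothetical, not the real file's headers.
example_csv = """individualID,diagnosis,causeOfDeath
A,Control,Cardiac
B,Schizophrenia,Suicide
C,Schizophrenia,Cardiac
D,Control,Cardiac
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
table = cross_table(rows, "diagnosis", "causeOfDeath")

# Print one line per (diagnosis, cause of death) combination.
for (diagnosis, cause), n in sorted(table.items()):
    print(f"{diagnosis}\t{cause}\t{n}")
```

With a real file, `io.StringIO(example_csv)` would be replaced by an open file handle and the two column names by whatever the clinical file actually uses.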
Thank you! I don't know off the top of my head what methods were used in PsychENCODE.

Do you know whether these values are equivalent to any of the columns in *Quant.genes.results files from PsychENCODE?

Per the description in the file wiki: "This is a text file containing a matrix of gene counts for each sample, with samples as columns and genes as rows. The count values are generated by HTSeq. See the [processing code](https://www.synapse.org/#!Synapse:syn3346718) for specific parameter settings used."

Thanks. And what are the values in the file syn3346749? Are they "expected counts" like in syn17894685 or something else?

The sample ID key mapping between identifiers is in syn18080588.

How can I find the clinical data for syn3346749 from CommonMind? I searched in the clinical file syn3354385, but I could not find the corresponding sample names.

This is the format of the existing data, but we will consider generating data in the format you describe from our common pipelines in the future.

Thank you. Is there an option to get an expression matrix in which each column is a single sample, like the files *geneExpressionRaw* in CommonMind?

The Capstone [paper supplement](https://science.sciencemag.org/content/sci/suppl/2018/12/12/362.6420.eaat8464.DC1/aat8464-Wang-SM.pdf) describes the resource in more detail. Expected counts, TPM-normalized counts, and FPKM-normalized counts are provided.

Regarding the sample-specific files ending with "Quant.genes.results": where can I find an explanation for each column? Which column corresponds to the gene-level count?

Solly is correct, although I recommend [this query](https://www.synapse.org/#!Synapse:syn8466658/tables/query/eyJzcWwiOiJTRUxFQ1QgKiBGUk9NIHN5bjg0NjY2NTggV0hFUkUgKCAoIFwiYXNzYXlcIiA9ICdybmFTZXEnICkpIEFORCBcImRhdGFUeXBlXCIgPSAnZ2VuZUV4cHJlc3Npb24nIE9SIFwiZGF0YVR5cGVcIj0gJ2lzb2Zvcm1FeHByZXNzaW9uJyIsICJpbmNsdWRlRW50aXR5RXRhZyI6dHJ1ZSwgImlzQ29uc2lzdGVudCI6dHJ1ZSwgIm9mZnNldCI6MCwgImxpbWl0IjoyNX0=) on the _dataType_ to find gene counts.
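The long token at the end of the query link above appears to be a base64-encoded JSON object holding the table query. As a small sketch (standard library only; the helper name `decode_synapse_query` is made up for this example), it can be decoded to see the SQL that the link runs:

```python
import base64
import json

# Token copied from the end of the query link above
# (the part after "/tables/query/").
token = (
    "eyJzcWwiOiJTRUxFQ1QgKiBGUk9NIHN5bjg0NjY2NTggV0hFUkUgKCAoIFwiYXNz"
    "YXlcIiA9ICdybmFTZXEnICkpIEFORCBcImRhdGFUeXBlXCIgPSAnZ2VuZUV4cHJl"
    "c3Npb24nIE9SIFwiZGF0YVR5cGVcIj0gJ2lzb2Zvcm1FeHByZXNzaW9uJyIsICJp"
    "bmNsdWRlRW50aXR5RXRhZyI6dHJ1ZSwgImlzQ29uc2lzdGVudCI6dHJ1ZSwgIm9m"
    "ZnNldCI6MCwgImxpbWl0IjoyNX0="
)


def decode_synapse_query(token):
    """Decode a base64-encoded query token into a Python dict."""
    return json.loads(base64.b64decode(token).decode("utf-8"))


query = decode_synapse_query(token)
print(query["sql"])
```

The decoded SQL selects from syn8466658 with conditions on `assay` (`rnaSeq`) and `dataType` (`geneExpression` or `isoformExpression`). Under standard SQL precedence the `OR` branch is not restricted by the `assay` condition, so it may be worth adding parentheses around the two `dataType` conditions if you adapt the query.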
Details for what is available and how data was processed in the Capstone collection can be found [here](https://www.synapse.org/#!Synapse:syn12080241). Data was run through a common pipeline.
My mistake - the previous table takes you to available RNASeq BAM and FASTQ files through PsychENCODE, while summary-level data is available through the Capstone Project.

I believe those were generated through the Capstone Project, which can be [queried here](https://www.synapse.org/#!Synapse:syn8466658/tables/query/eyJzcWwiOiJTRUxFQ1QgKiBGUk9NIHN5bjg0NjY2NTgiLCAic2VsZWN0ZWRGYWNldHMiOlt7ImNvbmNyZXRlVHlwZSI6Im9yZy5zYWdlYmlvbmV0d29ya3MucmVwby5tb2RlbC50YWJsZS5GYWNldENvbHVtblZhbHVlc1JlcXVlc3QiLCAiY29sdW1uTmFtZSI6ImFzc2F5IiwgImZhY2V0VmFsdWVzIjpbInJuYVNlcSJdfSx7ImNvbmNyZXRlVHlwZSI6Im9yZy5zYWdlYmlvbmV0d29ya3MucmVwby5tb2RlbC50YWJsZS5GYWNldENvbHVtblZhbHVlc1JlcXVlc3QiLCAiY29sdW1uTmFtZSI6ImZpbGVGb3JtYXQiLCAiZmFjZXRWYWx1ZXMiOlsidHN2Il19LHsiY29uY3JldGVUeXBlIjoib3JnLnNhZ2ViaW9uZXR3b3Jrcy5yZXBvLm1vZGVsLnRhYmxlLkZhY2V0Q29sdW1uVmFsdWVzUmVxdWVzdCIsICJjb2x1bW5OYW1lIjoic3R1ZHkiLCAiZmFjZXRWYWx1ZXMiOltdfV0sICJpbmNsdWRlRW50aXR5RXRhZyI6dHJ1ZSwgImlzQ29uc2lzdGVudCI6dHJ1ZSwgIm9mZnNldCI6MCwgImxpbWl0IjoyNX0=). Each sample is stored in a separate file as far as I can tell. @kelsey may be able to provide more details.

Where can I find in this table RNAseq individual-level gene counts (quantitated) data?

[This table](https://www.synapse.org/#!Synapse:syn4921369/wiki/390659) delineates data available through PsychENCODE. Diagnoses are noted in each study description. @amap or @kelsey will be able to help you with PsychENCODE better than I can.

Thank you. How do I perform a similar search in PsychENCODE too? Is there a similar table?

[This table](https://www.synapse.org/#!Synapse:syn2759792/wiki/197295) can help you find what you're looking for. RNAseq data is available aligned to two different genome builds, so note the one you want. Gene counts are referred to as "quantitated" RNAseq data.
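Since each sample's gene-level results seem to be stored in a separate file, one way to get a single matrix with one column per sample is to merge the per-sample files yourself. The sketch below assumes the files are tab-delimited with a header row containing `gene_id` and `expected_count`, as in typical RSEM `genes.results` output; that column layout is an assumption here and should be checked against the actual files. The function names and the toy data are made up for the example.

```python
import csv
import io


def read_gene_column(handle, value_column="expected_count"):
    """Return {gene_id: value} from one tab-delimited per-sample file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["gene_id"]: float(row[value_column]) for row in reader}


def build_matrix(sample_handles, value_column="expected_count"):
    """Merge per-sample files into (genes, samples, matrix), genes as rows.

    Genes missing from a sample's file are filled with None, not zero.
    """
    per_sample = {
        name: read_gene_column(handle, value_column)
        for name, handle in sample_handles.items()
    }
    samples = sorted(per_sample)
    genes = sorted(set().union(*(d.keys() for d in per_sample.values())))
    matrix = [[per_sample[s].get(g) for s in samples] for g in genes]
    return genes, samples, matrix


# Toy stand-ins for two per-sample files (assumed RSEM-style header).
header = "gene_id\ttranscript_id(s)\tlength\teffective_length\texpected_count\tTPM\tFPKM\n"
s1 = io.StringIO(header
                 + "geneA\ttx1\t1000\t800\t10.0\t5.0\t4.0\n"
                 + "geneB\ttx2\t2000\t1800\t0.0\t0.0\t0.0\n")
s2 = io.StringIO(header
                 + "geneA\ttx1\t1000\t800\t3.5\t2.0\t1.5\n"
                 + "geneB\ttx2\t2000\t1800\t7.0\t3.0\t2.5\n")

genes, samples, matrix = build_matrix({"sample1": s1, "sample2": s2})
```

With real data, each `io.StringIO(...)` would be replaced by an open handle to a downloaded file, and `value_column` can be switched to `"TPM"` or `"FPKM"` if those are the values you want.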
I mean individual-level gene counts.

@assify We have not generated microarrays on these data. By "RNA-Seq summary of schizophrenia data", do you mean individual-level gene counts? Or do you mean differential expression statistics?
Solly