Could you please provide the description of the files at https://console.cloud.google.com/storage/browser/dream-smc-rna/?
It's far from clear which files I should use and what they mean.
Created by Colin Barranco colinb
Dear Roman and Keith,
Thanks for this question. We have updated the data description on the [challenge website](https://www.synapse.org/#!Synapse:syn2813589/wiki/401457) based on your comments. Please let us know if you have further questions.
Best,
Kristen
Yes, exactly, thanks Keith.
Also, the paired-read orientation and insert size should be documented.
I agree. It could just be a brief description of each file type, like:
The fq.gz files contain the paired-end raw reads in fastq format, in this case simulated by [fusion simulator]?
The filtered.bedpe files contain the corresponding true fusion events, with 1-based coordinates indicating the fusion breakpoint?
The isoforms_truth.txt files contain the true normalized expression values for each ENSEMBL transcript?
I put question marks at the end of each of the above because I'm just guessing.
Dear Thomas,
I've seen that page; it doesn't describe the data properly.
For instance, what are sim1_mergeSort_1.fq.gz and sim1_mergeSort_2.fq.gz? What is sim11_filtered.bedpe? How should I interpret sim11_isoforms_truth.txt?
Dear Roman,
Apologies for the delay in responding. Here is the data description: https://www.synapse.org/#!Synapse:syn2813589/wiki/401457
These datasets can be found under `https://console.cloud.google.com/storage/browser/dream-smc-rna/training/`
Please let me know if there is still any confusion.
Best,
Thomas
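For anyone who wants to sanity-check the guess above that the `_1.fq.gz` and `_2.fq.gz` files hold the two mates of paired-end reads, here is a minimal Python sketch. It assumes standard four-line FASTQ records and that mate names match once a trailing `/1` or `/2` is removed; neither assumption is confirmed by the challenge documentation. The helper names (`read_fastq`, `mate_id`, `check_pairing`) are hypothetical and not part of any challenge tooling.

```python
import gzip


def read_fastq(path, limit=None):
    """Yield (name, sequence, quality) from a gzipped FASTQ file.

    Assumes four lines per record (header, sequence, '+', quality).
    """
    with gzip.open(path, "rt") as handle:
        count = 0
        while limit is None or count < limit:
            header = handle.readline().rstrip("\n")
            if not header:  # end of file (or a blank line)
                break
            seq = handle.readline().rstrip("\n")
            handle.readline()  # skip the '+' separator line
            qual = handle.readline().rstrip("\n")
            if not header.startswith("@") or len(seq) != len(qual):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            yield header[1:], seq, qual
            count += 1


def mate_id(name):
    """Drop any comment and a trailing /1 or /2 so mate names can be compared."""
    name = name.split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def check_pairing(path_1, path_2, limit=1000):
    """Compare the first `limit` records of both files; return how many were checked."""
    checked = 0
    pairs = zip(read_fastq(path_1, limit), read_fastq(path_2, limit))
    for (name_1, _, _), (name_2, _, _) in pairs:
        if mate_id(name_1) != mate_id(name_2):
            raise ValueError(f"mate names differ: {name_1!r} vs {name_2!r}")
        checked += 1
    return checked
```

After downloading the files locally, `check_pairing("sim1_mergeSort_1.fq.gz", "sim1_mergeSort_2.fq.gz")` would return the number of leading records whose names agree, or raise if they do not. This only checks that the two files are in the same read order; it says nothing about read orientation or insert size, which still need to come from the organizers.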