Hi @nrappapo,

Reposting your question here on the community forum.

```
Hi,

I'm looking at what I believe to be the raw counts from the Mayo clinic temporal cortex samples: Mayo_TCX_all_counts_matrix.txt, downloaded from here: https://www.synapse.org/#!Synapse:syn12104376

The first 4 lines are the STAR output for the nonspecific binding (unmapped, multimapping, noFeature, ambiguous) and then, if I understand correctly, the specific counts per gene for each sample.

However, when I look at the counts, I am a little confused, as the numbers for the non specific binding are much higher. For example, for the first sample, 1005_TCX:

N_unmapped 2675822
N_multimapping 17597859
N_noFeature 6860299
N_ambiguous 3245162

But if I sum the counts for all genes in this sample, I get 129670, which is quite a low percentage. As far as I read online, this doesn't look right, and I was wondering if you had any insight about it.

Also, do you know what is the difference than how this file was obtained: https://www.synapse.org/#!Synapse:syn4650257
```

For the first part of the question: looking at [syn8690799](https://www.synapse.org/#!Synapse:syn8690799), I get 40,185,486 counts (sum of rows 6:60730) for patient 1005_TCX. Do you mind sharing your code and confirming the file synID?

For the second part: the provenance for [syn8690799](https://www.synapse.org/#!Synapse:syn8690799) indicates that the counts were combined with [syn9757876](https://www.synapse.org/#!Synapse:syn9757876) and were tabulated across the BAM files with this shell script: [run star mayo](https://www.synapse.org/#!Synapse:syn9757879). The input BAM files are here: [syn4894912](https://www.synapse.org/#!Synapse:syn4894912).

The file MayoRNAseq_RNAseq_TCX_geneCounts.tsv ([syn4650257](https://www.synapse.org/#!Synapse:syn4650257)) doesn't appear to have any provenance associated with it, so I'm not sure how the counts were tabulated. It was created by @bheavner, so perhaps they can point us in the right direction!

Best,
Jake

Created by Jake Gockley jgockley
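
For anyone wanting to reproduce the per-sample check discussed above (summing the gene rows while keeping the STAR summary rows separate), here is a minimal sketch in Python. It assumes a tab-delimited matrix with a header row of sample IDs, the feature name in the first column, and summary rows whose names start with `N_`. The function name `summarize_counts` and the toy matrix are made up for illustration; the real file's layout may differ, so check it before relying on this.

```python
import csv
import io


def summarize_counts(handle, sample):
    """Return (gene_total, summary) for one sample column.

    Assumes: tab-delimited text, header row with sample IDs,
    first column holds the feature name, and STAR summary rows
    are the ones whose name starts with "N_".
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    col = header.index(sample)  # raises ValueError if the sample is missing

    gene_total = 0
    summary = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        feature, value = row[0], int(row[col])
        if feature.startswith("N_"):
            summary[feature] = value  # STAR summary row, kept separately
        else:
            gene_total += value  # ordinary gene row

    return gene_total, summary


# Toy matrix with invented numbers, only to show the mechanics.
demo = (
    "feature\t1005_TCX\t1010_TCX\n"
    "N_unmapped\t10\t20\n"
    "N_multimapping\t30\t40\n"
    "N_noFeature\t5\t6\n"
    "N_ambiguous\t1\t2\n"
    "GENE_A\t100\t200\n"
    "GENE_B\t250\t50\n"
    "GENE_C\t0\t7\n"
)

gene_total, summary = summarize_counts(io.StringIO(demo), "1005_TCX")
print("gene total:", gene_total)
print("summary rows:", summary)
```

Parsing each value with `int()` makes a mis-typed column fail loudly instead of silently producing a wrong total.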
Hi, I am also wondering where the file MayoRNAseq_RNAseq_TCX_geneCounts.tsv (syn4650257) comes from and how it differs from syn8690799. Is there any follow-up? Thanks, Giulio
Thank you, Jake! I found my bug (R newbie here...). In case someone ever faces the same issue, the solution was adding "stringsAsFactors = FALSE" to read.table. When I didn't do that, the columns loaded as factors, and the conversion to numeric values messed things up. All good now!
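
Some background on why factors produce a total that is far too low: in R, calling `as.numeric()` on a factor returns the integer level codes, not the numbers the labels spell out (`as.numeric(as.character(x))` recovers the values). The sketch below mimics that behaviour in Python purely as an illustration. The helper `as_factor` and the sample values are invented, and R's level ordering is locale-dependent, so treat this as an approximation of the mechanism, not a reproduction of the original session.

```python
def as_factor(values):
    """Mimic R's factor(): sorted unique labels plus 1-based codes."""
    levels = sorted(set(values))  # labels are sorted as strings
    codes = [levels.index(v) + 1 for v in values]
    return levels, codes


# Invented count strings, as they would look when read as text.
raw = ["17597859", "0", "42", "0", "1300"]

levels, codes = as_factor(raw)

# Summing the codes is what converting a factor straight to numeric does.
wrong_total = sum(codes)

# Going back through the labels recovers the real counts.
right_total = sum(int(levels[c - 1]) for c in codes)

print("levels:", levels)
print("codes:", codes)
print("sum of codes:", wrong_total)
print("sum of values:", right_total)
```

Because the codes only run from 1 to the number of distinct values in the column, their sum is tiny compared with the true read counts.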
