Questions regarding alignment for SEA-AD multiome GEX data

Hello, I have aligned your multiome snRNAseq MTG data. I am getting ~35% mapping, which is lower than I expected. What mapping percentage did you get? I am wondering if I did something wrong. Thanks.

Created by Ashay Patel aopatel
We have all the cellranger/cellranger-arc/cellranger-atac outputs here on Synapse. Depending on what output you want you can filter by filetype on the AD Knowledge portal.
@ktravaglini do you already have previously processed and aligned multiome data?
I would recommend using cellranger-arc (we used v2.0.0, per the excerpt from the manuscript I noted above).
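For readers following this recommendation, here is a minimal sketch of how a `cellranger-arc count` run is usually set up: the pipeline takes a libraries CSV with the columns `fastqs`, `sample`, and `library_type`. All paths, sample names, and the run ID below are hypothetical placeholders, not values from the SEA-AD pipeline, and the helper functions are illustrative only.

```python
import csv
import io
import shlex


def build_libraries_csv(gex_fastq_dir, gex_sample, atac_fastq_dir, atac_sample):
    """Return the text of a libraries CSV for cellranger-arc count."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["fastqs", "sample", "library_type"])
    writer.writerow([gex_fastq_dir, gex_sample, "Gene Expression"])
    writer.writerow([atac_fastq_dir, atac_sample, "Chromatin Accessibility"])
    return buf.getvalue()


def arc_count_command(run_id, reference_dir, libraries_csv_path):
    """Build the argument list for cellranger-arc count (default parameters)."""
    return [
        "cellranger-arc", "count",
        f"--id={run_id}",
        f"--reference={reference_dir}",
        f"--libraries={libraries_csv_path}",
    ]


if __name__ == "__main__":
    # Hypothetical inputs; replace with your own FASTQ folders and reference.
    print(build_libraries_csv("/data/gex_fastqs", "donor1_gex",
                              "/data/atac_fastqs", "donor1_atac"))
    print(shlex.join(arc_count_command("donor1_multiome",
                                       "/refs/arc_reference",
                                       "libraries.csv")))
```

The script only prints the CSV text and the command line; it does not run the pipeline.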
@ktravaglini If the data that I want to look at in the multiome folder is strictly snRNA-seq, is it appropriate to coerce the data using regular cell ranger count? This is the message I get:
```
Log message: Cell Ranger detected the chemistry ARC-v1, which may indicate a workflow error during sample preparation. Please check the reagents used to prepare this sample and contact 10x Genomics support for further assistance. If this workflow is intentional, you can force Cell Ranger to process this data by manually specifying the ARC-v1 chemistry in your analysis configuration.
```
That is not snRNAseq data, that's snMultiome data and should be aligned with cellranger-arc. -Kyle
@ktravaglini I am trying to align the MTG snRNAseq data from the multiome folder
No, cellranger v6.1.1 was used for the snRNAseq alignments (chemistry is 10x 3' v3.1). What file are you trying to align?
@ktravaglini Was cellranger-arc used for the snRNA-seq as well? Because when I try to use cell ranger count it's giving me an error and saying arc chemistry is detected.
@aopatel From the manuscript: Reads from snRNA-seq libraries were mapped to 10x Genomics' official human reference ("Human reference (GRCh38) - 2020-A") and unique molecular identifiers (UMIs) counted per gene using the cellranger (version 6.1.1) pipeline with the "--include-introns" parameter included. Reads from snATAC-seq and snMultiome libraries were mapped to the same reference using cellranger-atac (version 2.0.0) and cellranger-arc (version 2.0.0) pipelines with default parameters, respectively. 10x has release notes on 2020-A (https://www.10xgenomics.com/support/software/cell-ranger/latest/release-notes/cr-reference-release-notes#2020-a)
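As a rough illustration of the snRNA-seq settings quoted from the manuscript, the sketch below assembles a `cellranger count` command line with `--include-introns` passed as a bare flag, which is how cellranger 6.x accepts it. The run ID, reference directory, FASTQ directory, and sample name are hypothetical placeholders, and the helper function is not part of any SEA-AD code.

```python
import shlex


def cellranger_count_command(run_id, transcriptome_dir, fastq_dir, sample):
    """Build a cellranger (6.x style) count command that counts intronic reads."""
    return [
        "cellranger", "count",
        f"--id={run_id}",
        f"--transcriptome={transcriptome_dir}",
        f"--fastqs={fastq_dir}",
        f"--sample={sample}",
        "--include-introns",  # bare flag in 6.x; needed for nuclear pre-mRNA
    ]


if __name__ == "__main__":
    # Hypothetical inputs; the reference directory is assumed to hold the
    # unpacked 2020-A GRCh38 reference from 10x Genomics.
    cmd = cellranger_count_command("donor1_snRNA", "/refs/GRCh38-2020-A",
                                   "/data/fastqs", "donor1")
    print(shlex.join(cmd))
```

Newer cellranger releases changed how intron counting is configured, so check the release notes for the version you have installed before reusing this flag.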
@ktravaglini Which genome did you use?
Hi @aopatel, The vast majority of the libraries had >80% of their reads mapped uniquely to the genome with cellranger/STAR. A small subset of libraries from severely affected donors (described in Figure 1 of our manuscript) had alignment rates between 40-80%, and 2 libraries from 2 donors that we failed (due to low RIN) were below 40%. I suspect the issue is that single **nucleus** RNAseq will have fewer spliced transcriptomic reads (and I know kallisto builds a **transcriptomic** versus **genomic** reference). You will probably need to include introns in your transcriptomic reference. If you already have (I believe that's what nascent.txt is?), then it may be an issue with the source of the "nascent" transcripts itself (e.g. if it was built from non-brain tissue it may be missing nascent transcripts that are found exclusively in the brain). Best, Kyle
Yeah, I wonder what I am doing wrong. I assume your tech for multiome is most similar to 10XV3 out of these:
```
name          description                          on-list  barcode                  umi      cDNA
------------  -----------------------------------  -------  -----------------------  -------  -----------------------
10XV1         10x version 1                        yes      0,0,14                   1,0,10   2,None,None
10XV2         10x version 2                        yes      0,0,16                   0,16,26  1,None,None
10XV3         10x version 3                        yes      0,0,16                   0,16,28  1,None,None
10XV3_ULTIMA  10x version 3 sequenced with Ultima  yes      0,22,38                  0,38,50  0,62,None
BDWTA         BD Rhapsody                          yes      0,0,9 0,21,30 0,43,52    0,52,60  1,None,None
BULK          Bulk (single or paired)                                                         0,None,None 1,None,None
CELSEQ        CEL-Seq                                       0,0,8                    0,8,12   1,None,None
CELSEQ2       CEL-SEQ version 2                             0,6,12                   0,0,6    1,None,None
DROPSEQ       DropSeq                                       0,0,12                   0,12,20  1,None,None
INDROPSV1     inDrops version 1                             0,0,11 0,30,38           0,42,48  1,None,None
INDROPSV2     inDrops version 2                             1,0,11 1,30,38           1,42,48  0,None,None
INDROPSV3     inDrops version 3                    yes      0,0,8 1,0,8              1,8,14   2,None,None
SCRUBSEQ      SCRB-Seq                                      0,0,6                    0,6,16   1,None,None
SMARTSEQ2     Smart-seq2 (single or paired)                                                   0,None,None 1,None,None
SMARTSEQ3     Smart-seq3                                                             0,11,19  0,11,None 1,None,None
SPLIT-SEQ     SPLiT-seq                                     1,10,18 1,48,56 1,78,86  1,0,10   0,None,None
STORMSEQ      STORM-seq                                                              1,0,8    0,None,None 1,14,None
SURECELL      SureCell for ddSEQ                            0,0,6 0,21,27 0,42,48    0,51,59  1,None,None
Visium        10x Visium                           yes      0,0,16                   0,16,28  1,None,None
```
And I just trimmed using trim_galore:
```
trim_galore --quality 20 --fastqc --illumina --cores 9 --paired "$r1_file" "$r2_file"
```
Ah. I misunderstood what you meant by mapping. I'm pretty sure that our data had substantially higher mapping when we aligned to the genome, but I'll check and get back to you as soon as I hear back.
I used the Kallisto aligner with this command on your multiome-MTG snRNA-seq data that was previously trimmed with trim_galore:
```
kb count -x 10XV3 --workflow=nac -o output.dir -i index.idx -g t2g.txt -c1 cdna.txt -c2 nascent.txt --sum=total --batch-barcodes batch.txt --verbose --strand=unstranded
```
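To make the "~35% mapping" figure reproducible, here is a small sketch that reads the pseudoalignment rate from a kallisto-style `run_info.json`. It assumes the `kb count` output directory contains a `run_info.json` with `n_processed` and `n_pseudoaligned` fields, as kallisto normally writes; the file location and the helper name are assumptions and may differ between kb-python versions.

```python
import json
import sys
from pathlib import Path


def pseudoalignment_summary(run_info_path):
    """Return read counts and the percentage of reads that pseudoaligned."""
    info = json.loads(Path(run_info_path).read_text())
    n_processed = info["n_processed"]
    n_pseudoaligned = info["n_pseudoaligned"]
    # Guard against an empty run to avoid division by zero.
    pct = 100.0 * n_pseudoaligned / n_processed if n_processed else 0.0
    return {
        "n_processed": n_processed,
        "n_pseudoaligned": n_pseudoaligned,
        "pct_pseudoaligned": round(pct, 1),
    }


if __name__ == "__main__":
    # Usage (path is hypothetical): python mapping_rate.py output.dir/run_info.json
    if len(sys.argv) > 1:
        print(pseudoalignment_summary(sys.argv[1]))
```

Reporting this number per library, along with the index used, makes it easier to compare against the cellranger/STAR genome alignment rates mentioned elsewhere in this thread. The two are computed differently, so they are not expected to match exactly.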
Hi @aopatel, Based on what you wrote here, I'd guess that either your data is low quality, or there is something wrong with one of the data/file formatting steps between cell x gene matrix and mapping results, but would need more information to know for sure. Did you use MapMyCells for this or map the data in a different way? If the former, can you please provide the ID associated with your run? It will look something like this: 1720204204941-b151e3ce-4dfe-446e-b965-5633009d277f. This would allow folks on our end to debug, if you are okay with that. Best, Jeremy
@aopatel I will contact some of the computational SEA-AD scientists to provide some guidance.
@eitan.kaplan Yes, please, this would be helpful. Thank you.
Hi @aopatel, I'm not sure but we can ask the team that contributed the data. @eitan.kaplan, would you be able to help answer these questions about mapping for the SEA-AD data? Thanks!
Additionally the duplicates percentage is >50% for most samples?
