AnnData.concatenate() with two 10x multiome datasets?

mkarikom · December 9, 2022, 8:39pm

I have a 10x Multiome data set (GSE199994) and ran scvi.data.read_10x_multiome() on each of 10 batches (1 batch per patient).

But adataconcat = adata1.concatenate(adata2) does not work as I expected, and the issue seems to be that no peaks are shared between any of the batches:

For instance, adata1.var might contain 70,000 peaks with names like chr11:32333163-32334040, but none of these exist in any of the other patients.
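A quick way to confirm the lack of overlap is to intersect the feature names of each batch. The sketch below uses plain Python lists as stand-ins for `adata1.var_names` and `adata2.var_names`; apart from the first peak name, which is quoted above, the coordinates are made up for illustration, and `shared_features` is a hypothetical helper rather than part of anndata or scvi-tools.

```python
def shared_features(*var_name_lists):
    """Return the sorted feature names that occur in every batch."""
    sets = [set(names) for names in var_name_lists]
    return sorted(set.intersection(*sets)) if sets else []


# Stand-ins for adata1.var_names and adata2.var_names (hypothetical values).
# The peaks overlap on the genome, but their names differ because the
# boundaries were called independently in each batch.
batch1 = ["chr11:32333163-32334040", "chr1:1000-1500"]
batch2 = ["chr11:32333100-32334100", "chr1:1020-1490"]

shared = shared_features(batch1, batch2)
print(len(shared))  # number of peak names common to both batches
```

With real objects one would pass `adata1.var_names` and `adata2.var_names` instead of the lists. An empty result means a concatenation restricted to shared features has nothing to keep on the ATAC side.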

gtca · December 13, 2022, 10:11pm

Hi @mkarikom, I think concatenation here is only defined for the same feature sets. In this case it seems the peaks were called separately on different samples.

See a discussion on a seemingly similar topic in the MuData repository here for some more details.

mkarikom · December 29, 2022, 6:10pm

Thanks @gtca. I ended up reusing the 10x cell calling already performed for each batch and reducing the peaks in Signac using the raw ATAC data, then substituting the resulting features for the ones generated by the per-batch peak calling.
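For readers who want to see what such a peak reduction does, here is a minimal sketch in plain Python. It assumes "reducing" means merging overlapping intervals from all batches into one common peak set. It is not the Signac workflow used above, `parse_peak` and `reduce_peaks` are hypothetical helpers, and all coordinates except the first are invented. Counts would still need to be requantified against the merged peaks from the raw fragments, which this sketch does not cover.

```python
from collections import defaultdict


def parse_peak(name):
    """Split a 'chrom:start-end' peak name into (chrom, start, end)."""
    chrom, span = name.rsplit(":", 1)
    start, end = span.split("-")
    return chrom, int(start), int(end)


def reduce_peaks(*peak_lists):
    """Merge overlapping peaks from all batches into one common peak set."""
    by_chrom = defaultdict(list)
    for peaks in peak_lists:
        for name in peaks:
            chrom, start, end = parse_peak(name)
            by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlaps the current interval: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap found: emit the current interval and start a new one.
                merged.append(f"{chrom}:{cur_start}-{cur_end}")
                cur_start, cur_end = start, end
        merged.append(f"{chrom}:{cur_start}-{cur_end}")
    return merged


# Hypothetical per-batch peak names.
batch1 = ["chr11:32333163-32334040", "chr1:1000-1500"]
batch2 = ["chr11:32333100-32334100", "chr1:2000-2500"]

common_peaks = reduce_peaks(batch1, batch2)
print(common_peaks)
```

Once every batch is quantified against the same `common_peaks`, the batches share identical ATAC feature names and can be concatenated.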
