Working with Multiple samples with PEAKVI

MysoreSparrow · May 5, 2023, 2:31pm

Hello scverse community.

I am analyzing a scATAC-seq dataset with 5 samples from 5 time points (d01, d05, d10, d25 and an infected sample). I did the usual pre-processing for each sample (via Signac) and merged the 5 individual samples into a single Seurat object:

atac_data <- Reduce(function(x, y) merge(x, y), list(d01, d05, d10, d25, Inf_dpi4))

I converted this merged object to an AnnData object:

adata <- convertFormat(atac_data, from = "seurat", to = "anndata", main_layer = "counts", assay = "peaks", drop_single_values = FALSE, outFile = "converted_object_all5_unfiltered.h5ad")

and then used it with PEAKVI to obtain the UMAP.

Although I obtained the UMAP for this entire "merged" dataset, I would like to see the sample-wise contribution to the UMAP. As an example, please see the image below, which is from ArchR.

Question 1: Was it correct to merge the individual Seurat objects into a single Seurat object and then convert it, or would it have been better to convert the 5 Seurat objects individually into AnnData objects and then concatenate them? Would I then be able to access sample-wise information in the UMAP?
Question 2: Is it possible to obtain the sample-wise information for a UMAP obtained from PEAKVI at all, or do I need to convert this PEAKVI-trained object back into a Seurat object and then try to generate a UMAP there?

I feel like I am missing a step/trick somewhere but I can't see by myself what it is.
If you could provide me the information and/or point me to a link or tutorial where I can find a solution then it would be very helpful to me. Thank you for your time.

martinkim0 · May 5, 2023, 8:06pm

Hi, thank you for your question. If you are referring to coloring your UMAP by each experimental sample, I think either of the methods you have listed would work. You would just have to include an observation-wise metadata column indicating which sample each observation originates from.
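For the object you already merged and converted, a rough sketch of how such a column could be filled in is below. It is untested on your data and rests on two assumptions: that a Seurat metadata column such as orig.ident survived the conversion and differs between samples, or, failing that, that your cell barcodes carry a sample prefix like "d01_". The helper name sample_labels and the toy barcodes are made up for illustration.

```python
import pandas as pd


def sample_labels(obs: pd.DataFrame, existing_col: str = "orig.ident", sep: str = "_") -> pd.Series:
    """Return one sample label per cell (one per row of an AnnData-style obs table)."""
    # Prefer a metadata column carried over from Seurat, if it separates samples.
    if existing_col in obs.columns and obs[existing_col].nunique() > 1:
        return obs[existing_col].astype("category")
    # Fallback (assumption): barcodes look like "<sample><sep><barcode>".
    prefixes = obs.index.to_series().str.split(sep, n=1).str[0]
    return prefixes.astype("category")


# Toy stand-in for adata.obs; these barcodes and prefixes are hypothetical.
obs = pd.DataFrame(index=["d01_AAAC-1", "d01_AAAG-1", "d05_AAAC-1", "Inf_TTTG-1"])
obs["sample"] = sample_labels(obs)
print(obs["sample"].value_counts())
```

With the real object this would be something like adata.obs["sample"] = sample_labels(adata.obs). If neither assumption holds for your data, the labels have to come from wherever the sample identity was recorded before merging.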

If you were to convert each seurat object to anndata first, you can just add an obs column to each anndata with the sample labels. Then you can pass in that column to your plotting function.

In Python, it might look something like this:

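import anndata
import scanpy as sc

adata1.obs["sample"] = "d01"
adata2.obs["sample"] = "d05"
# same for other anndatas

# concatenate along cells; needs a UMAP in .obsm["X_umap"] before plotting
adata_all_samples = anndata.concat([adata1, adata2, ...])
sc.pl.umap(adata_all_samples, color="sample")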
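On the second question: training PEAKVI does not touch adata.obs, so there should be no need to go back to Seurat. A minimal sketch of the full round trip is below, assuming scvi-tools and scanpy are installed and that adata.obs already has a "sample" column. The function name peakvi_umap_by_sample and the "X_peakvi" key are my own choices, and whether to pass the sample column as batch_key depends on whether you want sample differences corrected away, so it is left off by default.

```python
from typing import Optional


def peakvi_umap_by_sample(adata, sample_key: str = "sample", batch_key: Optional[str] = None):
    """Train PEAKVI, embed its latent space with UMAP, and color cells by sample."""
    if sample_key not in adata.obs.columns:
        raise KeyError(f"adata.obs has no '{sample_key}' column")

    # Imported lazily: scanpy and scvi-tools are heavy, optional dependencies.
    import scanpy as sc
    import scvi

    # batch_key=None trains without batch correction; passing sample_key here
    # would treat samples as batches, which may or may not be what you want.
    scvi.model.PEAKVI.setup_anndata(adata, batch_key=batch_key)
    model = scvi.model.PEAKVI(adata)
    model.train()

    # Store the latent space, then build the neighbor graph and UMAP on it.
    adata.obsm["X_peakvi"] = model.get_latent_representation()
    sc.pp.neighbors(adata, use_rep="X_peakvi")
    sc.tl.umap(adata)

    # The obs columns are unchanged by training, so the sample labels are still there.
    sc.pl.umap(adata, color=sample_key)
    return model
```

One thing to watch if you go the concatenation route: anndata.concat uses join="inner" by default, so only peaks present in every per-sample object are kept. The per-sample objects need a shared peak set for that to work well.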
