Topic modelling or factorization on integrated scRNA-seq data?

dpcook · May 10, 2023, 7:26pm

Does anyone have thoughts or advice on approaching topic modelling or factorization on scVI-integrated data? I have several collections of scRNA-seq data from various sources, and scVI does a great job of producing a batch-corrected representation. Within some of the cell types, expression varies along more of a continuous gradient. If it were a single dataset, I’d try something like NMF to look at these expression programs, but with multiple sources I’d be concerned about recovering batch-specific factors.

Is there a reasonably clean way to approach this? E.g., NMF on transform_batch-corrected counts? In papers, it’s not uncommon to see NMF run on each sample independently, followed by a search for similar factors across samples, but this feels a little clunky.
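For concreteness, here is a minimal sketch of the first option, assuming scikit-learn's NMF and a non-negative cells x genes matrix of batch-corrected expression. The random matrix is only a stand-in so the snippet runs on its own; the function name `nmf_programs`, the batch label "batch_0", and all parameter values are hypothetical choices for illustration, not a recommended workflow.

```python
import numpy as np
from sklearn.decomposition import NMF


def nmf_programs(corrected, n_programs=5, seed=0):
    """Factorize a non-negative cells x genes matrix.

    Returns per-cell program usage (cells x programs) and
    per-program gene loadings (programs x genes).
    """
    model = NMF(
        n_components=n_programs,
        init="nndsvda",
        max_iter=500,
        random_state=seed,
    )
    usage = model.fit_transform(corrected)
    loadings = model.components_
    return usage, loadings


# Stand-in for batch-corrected expression (assumption: 200 cells, 50 genes).
# With a trained scVI model, the input could instead come from something like:
#   corrected = model.get_normalized_expression(
#       adata, transform_batch="batch_0", library_size=1e4, return_numpy=True
#   )
# where "batch_0" is a hypothetical batch category to project all cells onto.
rng = np.random.default_rng(0)
corrected = rng.gamma(shape=2.0, scale=1.0, size=(200, 50))

usage, loadings = nmf_programs(corrected, n_programs=4)
print(usage.shape, loadings.shape)
```

Whether factors recovered this way are free of batch-specific signal is exactly the open question; the sketch only shows where the factorization would plug in.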

I appreciate any advice!
