Hi guys,
I know that scVI is primarily designed for modeling UMI-based single-cell RNA-seq counts, but I’m wondering whether it might also be applicable to bulk RNA-seq data from homogeneous cell lines. Since bulk RNA-seq typically uses full-length sequencing protocols, I was thinking it might be possible to preprocess it similarly to Smart-seq2 data.
After reviewing the scVI parameters (the scVI page in the scvi-tools docs), I have a question about dropout. If we consider bulk RNA-seq data as essentially the sum of a large number of identical cells, dropout events should be rare. In that case, would I need to adjust any specific parameters related to dropout modeling (e.g. disable zero inflation or change the dispersion setting) to make scVI suitable for this type of data?
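To make the "sum of many identical cells" intuition concrete, here is a rough sketch in plain Python. It assumes each cell's count for a gene follows a negative binomial with the same mean and inverse dispersion, and that cells are independent, so the pooled count is again negative binomial with both parameters scaled by the number of cells. The numbers (`per_cell_mean`, `theta`, `n_cells`) are made up for illustration, and it only covers sampling zeros, not any technical dropout.

```python
import math


def nb_zero_prob(mean, theta):
    """P(count == 0) for a negative binomial with the given mean
    and inverse dispersion theta (variance = mean + mean**2 / theta)."""
    return (theta / (theta + mean)) ** theta


def pooled_zero_prob(mean, theta, n_cells):
    """P(count == 0) for the sum of n_cells independent, identical NB cells.
    The sum of n iid NB(mean, theta) variables is NB(n * mean, n * theta)."""
    return nb_zero_prob(n_cells * mean, n_cells * theta)


# Hypothetical values for a lowly expressed gene.
per_cell_mean = 0.5
theta = 1.0
n_cells = 1000

single = nb_zero_prob(per_cell_mean, theta)
pooled = pooled_zero_prob(per_cell_mean, theta, n_cells)

print(f"P(zero) in one cell:       {single:.4f}")
print(f"P(zero) pooled over {n_cells}: {pooled:.3e}")
```

Under these assumptions the pooled zero probability is just the single-cell one raised to the power `n_cells`, so it becomes negligible very quickly. If that reasoning holds for real bulk data, the zero-inflation component looks unnecessary. As far as I can tell, the `SCVI` model takes a `gene_likelihood` argument with `"nb"` as an alternative to the default `"zinb"`, which seems to be the relevant switch.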
Would appreciate any thoughts or suggestions on this.