scRNA-seq batch correction for DestVI


Thanks for providing DestVI. I’m trying to understand if one can include batch correction in the scRNA-seq data processing for DestVI. The Methods section of the DestVI paper (Lopez et al. 2022) mentions that scVI conditions on the batch identifier, whereas scLVM conditions on the cell type information.
The DestVI tutorial uses 4 lymph node scRNA-seq samples, which seem to have minimal batch effects between them, i.e. all 4 batches overlap very well with each other in the UMAP (see figure).

However, for the analysis that I’m running, I’d like to use scRNA-seq datasets from the same cancer type but from multiple studies, and I know that there are batch effects between studies. I can run scVI to correct the batch effects and get the embeddings from it. My question is: is there a way to pass those embeddings to the DestVI pipeline? From the tutorial it wasn’t clear to me whether CondSCVI does batch correction or not. I see that CondSCVI uses a labels_key parameter for cell types, but takes no batch ID. Am I right? Is this what the authors mean by “scLVM conditions on the cell type information [but not on the batch identifier]”?



It’s not exactly clear how to set the batch_id for the spatial dataset in this setup (it uses the Decoder and requires setting a batch_id), which is why it wasn’t included in the original DestVI. I am currently looking into it and have made some other changes to DestVI (see the can_destvi_v2 branch of scverse/scvi-tools on GitHub). However, it is under development, and I currently can guarantee neither a stable version nor model performance. I’ll try to post an update here when we have a release candidate.

Hello Can,

Thanks for your response. I look forward to a release candidate when you have a chance.