Batch correction in reconstructed gene space


We have recently been applying scVI for batch correction on scRNA-seq data from two different batches. The batch correction appears to work when we visualize the data; however, when we do a gene-wise comparison between batches, it looks like the batch effect has not been removed. Should we expect to see batch effects in the reconstructed gene space?


What do you mean by gene-wise comparison between batches? It sounds like you are comparing the reconstructed gene counts output by get_normalized_expression. If so, by default it conditions on each cell's original batch when inferring the normalized gene levels, which means it will reproduce the batch effects. If you would like a view of the expression in which all cells are simulated to come from the same batch, use the transform_batch kwarg: when set to an integer, it computes the normalized expression rates as if every cell came from that batch.
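As a rough sketch of that call, assuming model is an already trained scvi.model.SCVI and adata is the AnnData object it was set up with. The helper name expression_in_reference_batch and the library_size=1e4 scaling are illustrative choices, not something prescribed in this thread:

```python
def expression_in_reference_batch(model, adata, reference_batch, library_size=1e4):
    """Decode every cell as if it had been observed in `reference_batch`.

    Assumptions: `model` is a trained scvi.model.SCVI, and `reference_batch`
    is a batch index picked by the caller (hypothetical value).
    """
    # With transform_batch=None (the default) each cell is decoded with its
    # own batch, so batch effects reappear in the output.
    return model.get_normalized_expression(
        adata,
        transform_batch=reference_batch,  # decode all cells in this one batch
        library_size=library_size,        # scale frequencies to a common depth
    )
```

For example, expression_in_reference_batch(model, adata, 0) would return expression values for all cells decoded in batch 0, which is the output to use for a gene-wise comparison across batches.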

Hi Justin

Thanks for the quick response, I really appreciate it. Can you explain what exactly transform_batch is doing, and how we should interpret the values we get from it?


Also, the documentation is a bit confusing and out of date. We tried to use the following link, but it was broken:

It would be very helpful to have a tutorial on getting the corrected data with the up-to-date API.

I appreciate your work and wanted to give some feedback.


Hi, the tutorial is here

If you ever have a broken link you can always revert back to

We have a page here that we will complete describing transform_batch
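
In the meantime, the idea can be illustrated with a toy decoder. This is not scVI's implementation: the weights, batch offsets, and function names below are all made up for illustration, and the only point is that the decoder takes the batch as an input, so the same latent state can be decoded under any batch.

```python
import math

# Hypothetical parameters for a 3-gene toy decoder (not learned, not from scVI).
GENE_WEIGHTS = [0.5, -1.0, 2.0]       # effect of a 1-D latent state z on each gene
BATCH_OFFSETS = {
    0: [0.0, 0.0, 0.0],               # batch 0: no technical shift
    1: [1.5, -0.5, 0.0],              # batch 1: per-gene technical shift
}


def decode(z, batch):
    """Return expression frequencies (summing to 1) for latent z decoded in `batch`."""
    logits = [w * z + b for w, b in zip(GENE_WEIGHTS, BATCH_OFFSETS[batch])]
    peak = max(logits)                            # subtract max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_expression(cells, transform_batch=None):
    """cells is a list of (z, observed_batch) pairs.

    With transform_batch=None each cell is decoded in its observed batch;
    otherwise every cell is decoded in the batch given by transform_batch.
    """
    out = []
    for z, observed_batch in cells:
        batch = observed_batch if transform_batch is None else transform_batch
        out.append(decode(z, batch))
    return out


# Two cells with the same biological state, observed in different batches.
cells = [(0.3, 0), (0.3, 1)]
with_batch_effect = normalized_expression(cells)                # rows differ
same_batch = normalized_expression(cells, transform_batch=0)    # rows are identical
```

In this toy, the two cells differ in the default output only because of the batch offset, and they become identical once both are decoded in batch 0. The values read as the expression frequencies each cell would be expected to show had it been measured in the chosen batch.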