Batch correction on different conditions

Hello everyone,

I’ve often wondered whether I should be running a batch correction algorithm when integrating datasets from different 10x Genomics GEM wells that contain samples from the same type of mouse tissue but different conditions (e.g. mouse lung from a wildtype mouse and mouse lung from a knockout mouse). Any input would be appreciated.

Thanks!

I would argue there is no “should”. The question is too general for me to give a firm recommendation, but here are some thoughts.
By integrating the samples you will lose differences between the two conditions. However, integration is often helpful for downstream tasks such as cell typing. One estimate of the size of the non-biological batch effect is the variation between replicates (several mice with the same genotype and the same experimental condition). If that variation is high, I would recommend improving the experiment where you can; where you cannot (clinical samples, for example), perform batch integration and interpret the differences that remain.
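
To make the replicate comparison concrete, here is a minimal sketch of one way to do it. It assumes you have already collapsed each sample into a pseudobulk profile (for example, the mean normalized expression per gene), and it compares the distance between replicates of the same condition with the distance between conditions. The function name `replicate_vs_condition_distance` and the toy numbers are made up for illustration and are not part of any library or of the data discussed here.

```python
import numpy as np


def replicate_vs_condition_distance(pseudobulk, conditions):
    """Return (mean within-condition distance, mean between-condition distance).

    pseudobulk: array of shape (n_samples, n_genes), one profile per sample.
    conditions: one condition label per sample.
    Hypothetical helper for illustration only.
    """
    pseudobulk = np.asarray(pseudobulk, dtype=float)
    conditions = np.asarray(conditions)
    within, between = [], []
    n_samples = len(conditions)
    for i in range(n_samples):
        for j in range(i + 1, n_samples):
            # Euclidean distance between the two sample profiles
            dist = np.linalg.norm(pseudobulk[i] - pseudobulk[j])
            if conditions[i] == conditions[j]:
                within.append(dist)   # replicate pair: technical + animal-to-animal variation
            else:
                between.append(dist)  # cross-condition pair: adds the condition effect
    if not within or not between:
        raise ValueError("need replicates within a condition and at least two conditions")
    return float(np.mean(within)), float(np.mean(between))


# Toy data (invented): two wildtype and two knockout samples, three genes.
toy_profiles = [
    [1.0, 2.0, 3.0],  # WT replicate 1
    [1.1, 2.1, 2.9],  # WT replicate 2
    [3.0, 2.0, 1.0],  # KO replicate 1
    [3.1, 1.9, 1.1],  # KO replicate 2
]
toy_conditions = ["WT", "WT", "KO", "KO"]

within, between = replicate_vs_condition_distance(toy_profiles, toy_conditions)
print(f"within-condition: {within:.3f}, between-condition: {between:.3f}")
```

If the within-condition distance is close to the between-condition distance, replicate-to-replicate variation is about as large as the condition effect, which is the situation where batch effects deserve a closer look. Euclidean distance on pseudobulk profiles is just one simple choice here; other distance measures or per-cell-type comparisons would be equally reasonable.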
