Question on handling sc.pp.regress_out() in scanpy

Hi everyone!

I am a bioinformatics student, fairly new to the scanpy universe, and I have a question regarding the sc.pp.regress_out() function.
I scoured the internet for answers and thought I might as well try here.
I am wondering why scanpy’s pbmc3k tutorial (and many similar ones) uses ‘total_counts’ as well as ‘pct_counts_mt’ when regressing out unwanted variation. Why not just use ‘pct_counts_mt’, or some other specific source of unwanted variation in the dataset (cell cycle, for example)? Won’t regressing out total_counts affect every gene in the dataset, or am I looking at this all wrong?
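
To make the question concrete, here is a minimal NumPy sketch of what I understand the step to be doing. It assumes regress_out fits a plain linear model per gene on the covariates and keeps the residuals; that is my reading, not a copy of scanpy's code. The function name regress_out_sketch and the toy data are made up for illustration.

```python
import numpy as np

def regress_out_sketch(X, covariates):
    """Return per-gene residuals after a least-squares fit on the covariates.

    X: (n_cells, n_genes) expression matrix.
    covariates: (n_cells, n_covariates) array, e.g. total_counts and pct_counts_mt.
    Hypothetical helper for illustration; not scanpy's implementation.
    """
    n_cells = X.shape[0]
    # Design matrix: an intercept column followed by the covariates.
    design = np.column_stack([np.ones(n_cells), covariates])
    # Fit all genes at once; coef has shape (n_covariates + 1, n_genes).
    coef, *_ = np.linalg.lstsq(design, X, rcond=None)
    # What remains after subtracting the fitted values is the "regressed out" data.
    return X - design @ coef

# Toy data (invented): every gene scales with sequencing depth,
# and only gene 0 also depends on the mitochondrial percentage.
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 5
total_counts = rng.uniform(500, 5000, size=n_cells)
pct_counts_mt = rng.uniform(0, 10, size=n_cells)
X = np.outer(total_counts, rng.uniform(0.001, 0.002, size=n_genes))
X += rng.normal(size=(n_cells, n_genes))
X[:, 0] += 0.5 * pct_counts_mt

covariates = np.column_stack([total_counts, pct_counts_mt])
residuals = regress_out_sketch(X, covariates)

# The tutorial call I am asking about is:
# sc.pp.regress_out(adata, ['total_counts', 'pct_counts_mt'])
```

In this sketch every column of X changes once total_counts is among the covariates, which is exactly the behaviour I am unsure about.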

My supervisor and I would be grateful for any reply! :slight_smile:
