Preparing data for training of a classifier while preserving biological signal

Hi, I want to train a classifier (probably something tree-based) to determine sex based on the expression of certain genes.
My training data consists of multiple datasets.

To prepare the datasets, I would like to remove batch effects while keeping sex-specific differences. My batches and sex are confounded, as every sample was prepared separately.
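
To show what I mean by confounded, here is a minimal sketch with a made-up sample sheet (the batch labels and sexes are placeholders, not my real data). It just checks whether any batch contains both sexes:

```python
from collections import defaultdict

# Hypothetical sample sheet as (batch label, sex) pairs; not my real data.
samples = [("s1", "F"), ("s2", "M"), ("s3", "F"), ("s4", "M")]

# Collect the set of sexes observed in each batch.
sexes_per_batch = defaultdict(set)
for batch, sex in samples:
    sexes_per_batch[batch].add(sex)

# Batch and sex are fully confounded when no batch contains both sexes.
fully_confounded = all(len(sexes) == 1 for sexes in sexes_per_batch.values())
print("fully confounded:", fully_confounded)
```

In my case every batch holds a single sample, so this check comes out as fully confounded.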

If I run scVI with the sample as the batch, will it also eliminate the differences I am interested in, or can I do something to prevent this?

Thanks a lot!