I am working with a dataset that was integrated in Seurat using the CCA integration pipeline with SCTransform. I do the downstream analysis with a Python-based tool, and I need to extract highly variable genes from the integrated assay. For this, I wanted to use sc.pp.highly_variable_genes(adata); however, the function expects log-scaled input (or raw counts if flavor="seurat_v3"). Although the integrated data are log-scaled, they are also batch-corrected, which results in a significant proportion of negative values, so they no longer follow the distribution the function expects.
My question is how to deal with this issue. Is it acceptable to provide negative values? Alternatively, should the values be floored at 0?
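
To make the two options concrete, here is a minimal sketch of what I mean. It uses only NumPy on a synthetic matrix (cells x genes) that stands in for the integrated assay; it is not real Seurat output and it does not call scanpy. The helper names negative_fraction and floor_at_zero are hypothetical, made up for this illustration.

```python
import numpy as np


def negative_fraction(x):
    """Return the fraction of entries that are strictly below zero."""
    x = np.asarray(x, dtype=float)
    return float((x < 0).mean())


def floor_at_zero(x):
    """Return a copy of x with every negative entry replaced by 0."""
    return np.clip(np.asarray(x, dtype=float), 0.0, None)


# Synthetic stand-in for a batch-corrected matrix (200 cells x 5 genes).
# Assumption: values are centred near zero, so many entries are negative.
rng = np.random.default_rng(0)
gene_scales = np.array([0.2, 0.5, 1.0, 2.0, 3.0])
integrated = rng.normal(loc=0.0, scale=1.0, size=(200, 5)) * gene_scales

floored = floor_at_zero(integrated)

# Per-gene variance before and after flooring. Flooring can only shrink
# the variance, and it shrinks it by a different amount for each gene.
var_before = integrated.var(axis=0)
var_after = floored.var(axis=0)

print("fraction negative before:", negative_fraction(integrated))
print("fraction negative after: ", negative_fraction(floored))
print("per-gene variance before:", np.round(var_before, 3))
print("per-gene variance after: ", np.round(var_after, 3))
```

On this toy input, flooring removes all negative values but also changes the per-gene means and variances, which is part of why I am unsure whether it is a sound thing to do before selecting variable genes.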