How to find HVG in Seurat-integrated data?

pavsol · August 24, 2022, 5:24pm

Hi,
I am working with a dataset integrated with Seurat using the CCA integration pipeline with SCTransform. I do the downstream analysis with a Python-based tool, and I need to extract highly variable genes from the integrated assay. For this I wanted to use sc.pp.highly_variable_genes(adata); however, the function expects logarithmized input (or raw counts if flavor="seurat_v3"). Although the integrated data are log-scaled, they are also batch corrected, which results in a significant proportion of negative values, so they do not follow the distribution the function expects.
My question is how to deal with this issue. Is it acceptable to provide negative values? Alternatively, should the values be floored at 0?
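
To make the two options concrete, here is a minimal sketch on made-up numbers. It is not real integrated data and not scanpy's implementation. The gene names, the values and the helper functions are all hypothetical, and `dispersion` is just a plain variance-to-mean ratio used as a stand-in for a dispersion-style statistic. It only shows how negative values can push a gene's mean below zero, and how flooring at 0 changes the statistic.

```python
from statistics import mean, variance

# Toy stand-in for an integrated assay: gene -> values across 4 cells.
# All numbers are invented for illustration only.
integrated = {
    "geneA": [1.2, 0.8, -0.5, 1.0],
    "geneB": [-0.3, -0.1, 0.4, -0.2],
    "geneC": [0.0, 2.1, 1.9, -0.7],
}


def fraction_negative(data):
    """Share of all entries that are below zero."""
    values = [v for vals in data.values() for v in vals]
    return sum(v < 0 for v in values) / len(values)


def floor_at_zero(data):
    """Replace every negative entry with 0.0 (the 'flooring' option)."""
    return {gene: [max(v, 0.0) for v in vals] for gene, vals in data.items()}


def dispersion(vals):
    """Plain variance-to-mean ratio; a simplification, not scanpy's code."""
    return variance(vals) / mean(vals)


floored = floor_at_zero(integrated)

print("fraction of negative entries:", fraction_negative(integrated))
for gene in integrated:
    # Compare the statistic before and after flooring for each gene.
    print(gene, dispersion(integrated[gene]), dispersion(floored[gene]))
```

In this toy case geneB has a negative mean before flooring, so its variance-to-mean ratio is negative. After flooring, every ratio is positive, but the values themselves have been changed.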

Thank you

Holly · July 27, 2023, 10:42pm

Hi, I have a similar question. Did you find a solution you could share?
Thank you
