Inquiry about Data Input and DE Analysis Details in scVI

qweasdf1354 · April 29, 2024, 2:40am

Dear SCVI team,

I hope this message finds you well. I am currently utilizing scVI for my research and have some questions regarding the data input and analysis process, particularly concerning differential expression (DE) analysis. Your insights would be greatly helpful in advancing my understanding and application of the tool.

1. Data Input for DE Analysis: When conducting DE analysis using scVI, what specific type of data should be passed to the model? Is it the raw count data, normalized data within the model, or another form of standardized data? I noticed that the tutorials do not explicitly specify the data input type, which has led to some uncertainty about what exactly is being used for DE analysis.
2. Interpretation of DE Results: Regarding the DE results, specifically when a mean_log2FC value is positive and exceeds 0.5, how should this be interpreted in terms of the comparison between two groups, say Group A and Group B? Does a positive value indicate that gene expression in Group A is greater than in Group B, or vice versa?
3. Inclusion of Batch Parameters: In community discussions, there is often debate over whether to include a ‘batch’ parameter in the model, and how setting it to True or False might affect the outcomes. Could you provide some guidance on when it is advisable to include this parameter and when it might not be necessary?
4. Data for Visualization: For visualizing markers or DE results, should we stick to the data form used in DE analysis or can we use normalized or scVI-normalized data?
5. Scanpy Integration: Regarding the integration with scanpy for DE analysis, is normalized data typically used in scanpy, while scVI might use a different form of data?

I apologize for the multitude of questions, but your expertise would greatly clarify these crucial aspects, enabling more accurate application and interpretation of scVI in my work.

Thank you very much for your time and assistance.

Best regards,

qweasdf1354 · April 29, 2024, 2:54am

I would also like to know whether I should normalize the raw count adata before using the scVI DE model. As I understand it, DE analysis uses the trained model, which is initially created with scvi.model.SCVI. Is that correct?

martinkim0 · April 29, 2024, 4:03pm

I can address a subset of the questions:

Most of the models in scvi-tools (including scVI) require raw count data as input since the generative process parametrizes discrete distributions (negative binomial or Poisson). The generative process also learns normalized expression values (scvi.model.SCVI.get_normalized_expression) that are used for downstream differential expression. You can learn more about that in our user guide.
It’s recommended to set up a batch key using scvi.model.SCVI.setup_anndata when you expect there to be technical effects in your data (e.g. assay type), and this can help later for differential expression.
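For readers who want to see how these pieces fit together, here is a minimal sketch of the workflow described above. It assumes scvi-tools is installed and that the raw counts live in a layer named "counts"; the wrapper `run_scvi_de`, the layer name, and the column names passed in are hypothetical, and training uses default settings purely for illustration.

```python
def run_scvi_de(adata, groupby, group1, group2, counts_layer="counts", batch_key=None):
    """Train scVI on raw counts, then compare group1 against group2.

    `counts_layer` and `batch_key` are illustrative assumptions; adjust them
    to wherever your raw counts and batch labels are actually stored.
    """
    import scvi  # imported here so the sketch can be read without scvi-tools installed

    # Register the raw (unnormalized) counts and, optionally, a batch column.
    scvi.model.SCVI.setup_anndata(adata, layer=counts_layer, batch_key=batch_key)

    # Fit the model on the registered counts; no normalization is applied beforehand.
    model = scvi.model.SCVI(adata)
    model.train()

    # Normalized expression is produced by the trained model itself.
    normalized = model.get_normalized_expression(adata)

    # DE is computed from the trained model, not from a separately normalized matrix.
    de = model.differential_expression(groupby=groupby, group1=group1, group2=group2)
    return model, normalized, de
```

A call might look like `run_scvi_de(adata, "cell_type", "A", "B", batch_key="batch")`, where "cell_type" and "batch" are placeholder column names in `adata.obs`.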

qweasdf1354 · April 30, 2024, 6:53am

Thank you for your reply!

cane11 · May 3, 2024, 8:38pm

Positive LFC means higher expression in group1 than group2.
I prefer validating and plotting results using raw data. This gives you additional insights about the actual measured differences.
Scanpy takes log-transformed values (after library size normalization) as input, while scVI uses raw count data. In scVI the reported LFC is an arithmetic mean of LFCs, whereas in scanpy it is the LFC of a geometric mean. See "The impact of package selection and versioning on single-cell RNA-seq analysis" (PubMed) for more details.
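To make the sign convention and the two averaging conventions concrete, here is a small toy sketch in plain Python. It is not the scVI or scanpy implementation: `mean_of_lfcs` and `lfc_of_log1p_means` are hypothetical helpers, the numbers are made up, and pairing values one-to-one is a simplification. It only illustrates that a positive value means group 1 is higher than group 2, that swapping the groups flips the sign, and that averaging per-pair log ratios need not give the same number as a log ratio of means computed on log1p-transformed values.

```python
import math

def mean_of_lfcs(group1, group2):
    """Arithmetic mean of per-pair log2 fold changes (group1 vs group2)."""
    lfcs = [math.log2(a) - math.log2(b) for a, b in zip(group1, group2)]
    return sum(lfcs) / len(lfcs)

def lfc_of_log1p_means(group1, group2, eps=1e-9):
    """log2 ratio of group means taken on log1p values, then back-transformed."""
    mean1 = sum(math.log1p(x) for x in group1) / len(group1)
    mean2 = sum(math.log1p(x) for x in group2) / len(group2)
    return math.log2((math.expm1(mean1) + eps) / (math.expm1(mean2) + eps))

# Made-up normalized expression values for a single gene.
group_a = [4.0, 8.0, 16.0]
group_b = [1.0, 2.0, 4.0]

lfc_mean = mean_of_lfcs(group_a, group_b)
lfc_log1p = lfc_of_log1p_means(group_a, group_b)

print(f"mean of per-pair LFCs:    {lfc_mean:.3f}")
print(f"LFC of log1p-based means: {lfc_log1p:.3f}")
print(f"groups swapped:           {mean_of_lfcs(group_b, group_a):.3f}")
```

Both values are positive here because group A is higher than group B, but they are not identical, which is one reason LFCs from different packages should not be compared number for number.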
