DE between conditions in a cluster

I have a dataset where cells from multiple conditions can be present in a given cluster, e.g. old and young donors. I want to run a DE analysis between age groups in, say, T cells. I was thinking of creating a new metadata column that combines cell type (e.g. B or T cells) and age. So in the differential expression comparisons I’d do:

model.differential_expression(
    groupby="cell_type_age",
    group1="T_young",
    group2="T_old"
)

Does this make any sense? Maybe there’s a better way to do it?

Also, is there a way to use scVI batch-corrected counts in other DE models?

Thanks in advance!

I have kind of the same question.

Is there a direct way, through the arguments of model.differential_expression, to request differential expression between 2 groups (here young vs old) while restricting it to a certain Leiden cluster (CD4 T cells, CD8 T cells, etc.)?

Or do we first have to filter our AnnData to keep only CD4 T cells, run differential expression with groupby (young/old), and then do the same for CD8 T cells?

It would be nice to have this feature directly in the arguments (select 2 groups to compare, and run the comparison for given cell types, etc.).

It is strange that this type of comparison is not yet included, because most single-cell expression datasets include several groups (control vs KO, young vs aged, control vs diabetic, etc.) and multiple cell types.

To be fair, the way this is implemented in Seurat, you either subset a group and then run DE, or you create a new column as I described. I’m just not sure whether this also translates to scVI.
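
For the subsetting route, here is a minimal sketch of how the two groups could be selected without copying the data. It builds boolean masks over a toy table standing in for adata.obs. The column names "cell_type" and "age" and the helper condition_masks are assumptions made for illustration, not part of scvi-tools. As far as I know, model.differential_expression also accepts idx1 and idx2 in place of groupby/group1/group2, so the masks could be passed there, but please check this against the version you have installed.

```python
import pandas as pd

# Toy stand-in for adata.obs; column names are assumptions for illustration.
obs = pd.DataFrame(
    {
        "cell_type": ["T", "T", "B", "B", "T"],
        "age": ["young", "old", "young", "old", "old"],
    },
    index=[f"cell_{i}" for i in range(5)],
)


def condition_masks(obs, cell_type, cond1, cond2,
                    cell_type_key="cell_type", cond_key="age"):
    """Hypothetical helper: boolean masks for two conditions within one cell type."""
    in_type = obs[cell_type_key] == cell_type
    idx1 = (in_type & (obs[cond_key] == cond1)).to_numpy()
    idx2 = (in_type & (obs[cond_key] == cond2)).to_numpy()
    return idx1, idx2


idx1, idx2 = condition_masks(obs, "T", "young", "old")

# With a trained model, the masks could then be used like this (not run here):
# de = model.differential_expression(idx1=idx1, idx2=idx2)
```

Calling the helper once per cell type (CD4 T cells, CD8 T cells, and so on) would give one comparison per population.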

Yes, creating a new column is a good idea.
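
A small sketch of how such a combined column could be built with pandas. The table below is a toy stand-in for adata.obs, and the column names "cell_type" and "age" are assumptions; substitute whatever your metadata actually uses.

```python
import pandas as pd

# Toy stand-in for adata.obs; column names are assumptions for illustration.
obs = pd.DataFrame(
    {
        "cell_type": ["T", "T", "B", "B", "T"],
        "age": ["young", "old", "young", "old", "old"],
    },
    index=[f"cell_{i}" for i in range(5)],
)

# Combine cell type and age into a single categorical label such as "T_young".
obs["cell_type_age"] = (
    obs["cell_type"].astype(str) + "_" + obs["age"].astype(str)
).astype("category")

# Inspect how many cells fall into each combined group before running DE.
print(obs["cell_type_age"].value_counts())
```

The resulting labels ("T_young", "T_old", ...) are then what you would pass as group1 and group2 with groupby="cell_type_age". Checking the counts first helps spot combinations with very few cells.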

I think it would be great to have multiple levels of groupby’s in scVI. One groupby for cell populations to do DE in (e.g. cell types) and one groupby for things to test per cell population (treatment, disease status, etc.)

/Valentine