Suggestion on parameters for training scvi model

martibonomi · December 4, 2023, 7:07pm

Thank you very much for your fast reply! This helped me a lot.

I also have another question: I would like to integrate my samples, but instead of using only the top 2000 highly variable genes, I would like to use all the genes.
However, the batches do not all contain the same genes, so when creating the concatenated matrix I would do:

import os
import anndata as ad

anndata_dir = 'C:/Users/Martina/Desktop/CAR-T Atlas Data/AnnData'
list_files = os.listdir(anndata_dir)
anndata_list = []

# Read every file in the directory into its own AnnData object
for filename in list_files:
    file_path = os.path.join(anndata_dir, filename)
    anndata_obj = ad.read_h5ad(file_path)
    anndata_list.append(anndata_obj)

# Stack cells (axis=0) and keep the union of genes across all batches
concatenated_anndata = ad.concat(anndata_list, axis=0, join='outer')

so that I can keep all the genes; for cells from batches that lack a given gene, the ad.concat function fills in 0 counts.

I would like to do this in order to better integrate the data taking into account all the possible variability and then run the differential gene expression over all the genes to better characterise all the cells.

Would you recommend doing this? Does it remove batch effects efficiently? Or should I use a different approach? If so, which approach would you recommend?

What makes me doubt this approach is the differential gene expression step. Please correct me if I'm wrong, but I would think that the cells from batches in which some genes were not measured, and which therefore have 'zeros' added by the ad.concat function through the join='outer' setting, would bias the DGE results: the added 0 counts do not mean the genes are not expressed, only that I don't have that information.
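One thing I could do to at least keep track of this is to record, before concatenating, which genes were actually measured in which batch, so that padded genes can be flagged or excluded at the DGE step. Below is a minimal sketch of that bookkeeping using pandas only. The dictionary `genes_per_batch` and its contents are hypothetical; in practice the gene lists would come from each object's `var_names`. This is not an scvi-tools recommendation, just an illustration of the idea.

```python
import pandas as pd

# Hypothetical gene lists per batch; in practice these would be taken
# from each AnnData object's var_names before concatenation.
genes_per_batch = {
    "batch_A": ["g1", "g2", "g3"],
    "batch_B": ["g2", "g3", "g4"],
}

# Union of all genes, i.e. what an outer join would keep
all_genes = sorted(set().union(*genes_per_batch.values()))

# Boolean table: rows are genes, columns are batches,
# True where the gene was actually measured in that batch
measured = pd.DataFrame(
    {
        batch: pd.Index(all_genes).isin(genes)
        for batch, genes in genes_per_batch.items()
    },
    index=all_genes,
)

# Genes measured in every batch (no padded zeros anywhere)
in_all_batches = measured.all(axis=1)
shared_genes = measured.index[in_all_batches].tolist()

# Genes that would receive padded zeros in at least one batch
padded_genes = measured.index[~in_all_batches].tolist()

print(measured)
print("shared:", shared_genes)
print("padded in some batch:", padded_genes)
```

With such a table, the DGE could be restricted to `shared_genes`, or results for `padded_genes` could be interpreted with caution, but I am not sure whether this is the recommended way to handle it.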

Thank you so much for your help!!
