Scvi for developmental data

yotamcons · November 20, 2023, 3:02pm

Hey there,
I want to fit an scVI model to developmental data, meaning the different batches come from different timepoints and thus differ in many ways. Specifically, I intend to use this for Solo doublet removal, but my question is also general.

Would it be correct to indicate the different timepoints as batches?

martinkim0 · November 20, 2023, 8:54pm

Hi, thanks for your question. scVI wasn’t explicitly designed for integration across different timepoints, so your mileage may vary when trying to fit the model to developmental data.

I would give it a try and see how well it performs. You can pass in the timepoints with either batch_key or continuous_covariate_keys in SCVI.setup_anndata, depending on whether you want the model to treat it categorically (one-hot) or continuously.
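To make the two options concrete, here is a minimal sketch. The helper names (timepoint_setup_kwargs, setup_with_timepoints) and the idea of selecting the mode with a string are illustrative assumptions, not part of scvi-tools; only SCVI.setup_anndata, batch_key and continuous_covariate_keys come from the library. For the continuous option, the column in adata.obs is assumed to be numeric (for example, days or hours).

```python
def timepoint_setup_kwargs(column, treat_as="categorical"):
    """Build the keyword arguments for SCVI.setup_anndata (hypothetical helper).

    column   -- name of the adata.obs column holding the timepoint
    treat_as -- "categorical" (one-hot, via batch_key) or
                "continuous" (via continuous_covariate_keys)
    """
    if treat_as == "categorical":
        return {"batch_key": column}
    if treat_as == "continuous":
        # continuous_covariate_keys expects a list of obs column names
        return {"continuous_covariate_keys": [column]}
    raise ValueError("treat_as must be 'categorical' or 'continuous'")


def setup_with_timepoints(adata, column, treat_as="categorical"):
    """Register the AnnData and return an untrained SCVI model (hypothetical helper)."""
    import scvi  # imported lazily so the helper above works without scvi-tools

    scvi.model.SCVI.setup_anndata(adata, **timepoint_setup_kwargs(column, treat_as))
    return scvi.model.SCVI(adata)


categorical_kwargs = timepoint_setup_kwargs("timepoint")
continuous_kwargs = timepoint_setup_kwargs("timepoint_days", treat_as="continuous")
```

Fitting both variants and comparing the resulting embeddings is probably the quickest way to see which treatment suits a given dataset.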

cane11 · November 28, 2023, 6:30pm

This sounds like a good fit for Decipher: https://www.biorxiv.org/content/10.1101/2023.11.11.566719v1 It assumes correlated latent factors, which is what you want for time-series data, and instead of attempting batch correction it embeds the cells in a very low-dimensional space (2D).
Generally, I wouldn’t try to correct for the batch in this setup.

yotamcons · November 29, 2023, 3:21pm

Thanks @cane11, looks like an interesting method. I think it doesn’t fit my needs (doublet removal in the embedding), but the idea is intriguing.

cane11 · November 29, 2023, 8:27pm

For doublet removal, train an scVI model with or without batch information (it doesn’t really matter there), then run Solo on each batch separately, like:

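import numpy as np
import pandas as pd
import scvi

# Assumes `rna` is the AnnData that `scvi_model` was trained on, `batch_key` is the
# obs column holding the batch, and `configs` is your own settings dict.
batches = pd.unique(rna.obs[batch_key])
is_solo_singlet = np.ones((rna.n_obs,), dtype=bool)
for batch in batches:
  print("Running solo on batch {}...".format(batch))
  # Train a separate Solo classifier on the cells of this batch only
  solo_batch = scvi.external.SOLO.from_scvi_model(scvi_model, restrict_to_batch=batch)
  solo_batch.train(max_epochs=configs["solo_max_epochs"])
  is_solo_singlet[(rna.obs[batch_key] == batch).values] = solo_batch.predict(soft=False) == "singlet"
rna.obs["is_solo_singlet"] = is_solo_singlet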
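The loop above writes each batch's calls into the mask by position, which relies on the predictions coming back in the same order as the cells of that batch. A library-free sketch of a more defensive variant is to merge the calls by cell name instead. The function merge_singlet_calls and the toy cell and batch names are hypothetical, and the sketch assumes each batch's calls can be expressed as a mapping from cell name to "singlet" or "doublet".

```python
def merge_singlet_calls(cell_names, calls_per_batch, default=True):
    """Return one boolean per cell in `cell_names`; True means singlet.

    calls_per_batch -- {batch: {cell_name: "singlet" or "doublet"}}
    default         -- value for cells without a call (True mirrors the
                       all-singlet initialisation used in the loop above)
    """
    merged = {}
    for calls in calls_per_batch.values():
        for cell, label in calls.items():
            merged[cell] = label == "singlet"
    # Look every cell up by name, so the order within a batch does not matter
    return [merged.get(cell, default) for cell in cell_names]


# Toy data: four cells, two batches, one cell (c2) without a call
cells = ["c1", "c2", "c3", "c4"]
calls = {
    "E10": {"c3": "doublet", "c1": "singlet"},
    "E12": {"c4": "singlet"},
}
mask = merge_singlet_calls(cells, calls)
```

With real data, the same idea would be to key the per-batch predictions by observation name before writing them back into rna.obs.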
