TotalVI probabilistic model

Atanas_Pavlov · November 3, 2021, 9:54pm

Hi, great work creating these ML models. I have recreated a subset of the TotalVI VAE model handling ADT data in TensorFlow. In the process I noticed that the distribution for the background rate is a function of the latent variable z. However, I cannot find any documentation for that in the TotalVI paper. There it is only stated that the prior for that distribution (beta_nt) is LogNormal with a mean/sd for each batch x protein pair, but I don't see any reference stating that the posterior is computed from z (whereas the functions computing the foreground scale and the mixing are explicitly described). So, should the background rate depend on z, or should it only depend on the batch?

Cheers,
Atanas

adamgayoso · November 3, 2021, 11:29pm

Hi Atanas,

In equation 10 of the manuscript methods, we show the factorization of the approximate posterior. In this factorization indeed \beta depends on z. I’ll admit it’s likely confusing as to why this computation is in the generative function and not the inference function of the model.

In the loss method you will find the following term:

github.com/YosefLab/scvi-tools/blob/ac0c3e04fcc2772fdcf7de4de819db3af9465b6b/scvi/module/_totalvae.py#L605-L607

    kl_div_back_pro_full = kl(
        Normal(py_["back_alpha"], py_["back_beta"]), self.back_mean_prior
    )

This is the KL divergence between the approximate posterior and the prior for the background.
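For anyone reimplementing this term outside PyTorch, here is a minimal sketch of the closed-form KL divergence between two univariate Normal distributions, which is the quantity the snippet above computes per cell and protein. It assumes the prior is also a Normal on the log scale. Since KL divergence is unchanged by an invertible transformation such as exp, this equals the KL between the corresponding LogNormals. The function name `kl_normal` and the numeric values are hypothetical and chosen only for illustration; they are not taken from scvi-tools.

```python
import math


def kl_normal(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ) for scalars."""
    var_ratio = (sigma_q / sigma_p) ** 2
    mean_term = ((mu_q - mu_p) / sigma_p) ** 2
    return 0.5 * (var_ratio + mean_term - 1.0 - math.log(var_ratio))


# Hypothetical values for a single cell and a single protein.
# post_* stand in for py_["back_alpha"] and py_["back_beta"] (decoder outputs that depend on z);
# prior_* stand in for the learned per-batch, per-protein prior parameters.
post_mu, post_sigma = 0.3, 0.5
prior_mu, prior_sigma = 0.0, 1.0

kl_value = kl_normal(post_mu, post_sigma, prior_mu, prior_sigma)
print(f"KL(posterior || prior) = {kl_value:.4f}")
```

In a full model this value would be computed for every cell and protein and summed into the loss, alongside the KL terms for the other latent variables.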

adamgayoso · November 3, 2021, 11:31pm

I also encourage you to check out supplementary note 6 of the manuscript.

Atanas_Pavlov · November 4, 2021, 3:27pm

Thank you Adam for the quick and detailed answer!
Does this mean that z encodes information about both the background and the foreground? Ideally, wouldn't we want it to encode only information about the foreground?
One more question. I am using the model to merge data from different batches. Right now the batch is one-hot encoded as an input, so if the number of batches changes, the network has to be retrained. Have you thought about changing the architecture so that the batch information is decoupled from the core model and instead encoded as an additional, fixed number of continuous inputs (say, equal to the number of proteins, or 2 x the number of proteins)? Then, if new batches arrive, only this portion of the network would have to be retrained, while the core network could remain the same. Do you think that makes sense?

Atanas

adamgayoso · November 4, 2021, 10:12pm

Cell state can be informative of both foreground and background. Consider that knowing a cell is a B cell should be enough information to say that it has no CD4 abundance.

I encourage you to check out this user guide:
