Questions about the Cell2location mathematical model


I am a graduate student in mathematics at Harbin Institute of Technology, China, and my research focuses on bioinformatics. I recently read the paper “Cell2location maps fine-grained cell types in spatial transcriptomics”. I am very interested in its mathematical model and have studied it carefully, but I still have some questions after reading the Supplementary Methods. I would appreciate any answers or thoughts on them.
Question 1: How exactly is a probabilistic graphical model with so many levels constructed?
Question 2: How should an ablation experiment with so many parameters be designed?
How can it be proved that the current model is optimal?
Why not add another layer of prior distributions to each hyperparameter of the graph?
Question 3: If so, how are the specific optimizations implemented in code?


See the description in the Supplementary Methods. Implementing a model of this complexity is made possible by probabilistic programming frameworks and by automated variational inference, which is used to estimate the posterior distributions.
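To make "levels" concrete, here is a minimal sketch of how a hierarchical model is stacked: each level is drawn conditional on the level above it. This is a toy gamma-Poisson hierarchy written with the Python standard library only, not the actual Cell2location model; the function names, the distributions and every numeric value are assumptions chosen for illustration. A probabilistic programming framework lets you declare the same structure and then fits it with variational inference instead of sampling forward from it.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_toy_hierarchy(n_locations, n_genes, rng):
    """Ancestral sampling: every level is drawn conditional on the level above it."""
    # Level 1 (hyperprior): expected cell abundance shared by all locations.
    # gammavariate takes (shape, scale), so this has mean 10 * 0.5 = 5.
    mean_abundance = rng.gammavariate(10.0, 0.5)
    # Level 2 (prior): per-location abundance centred on the shared mean.
    abundance = [rng.gammavariate(5.0, mean_abundance / 5.0) for _ in range(n_locations)]
    # Level 2 (prior): per-gene expression rate per cell (hypothetical values).
    expression = [rng.gammavariate(2.0, 1.0) for _ in range(n_genes)]
    # Level 3 (likelihood): observed counts given all latent variables.
    counts = [[sample_poisson(a * g, rng) for g in expression] for a in abundance]
    return mean_abundance, abundance, expression, counts


rng = random.Random(0)
mean_abundance, abundance, expression, counts = sample_toy_hierarchy(4, 3, rng)
for location, row in enumerate(counts):
    print(f"location {location}: abundance={abundance[location]:.2f} counts={row}")
```

Inference runs this in reverse: given the observed counts, it estimates the posterior over the abundance and expression variables.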

Vary one parameter at a time while keeping the rest of the model the same. Sometimes you also need to test for interactions between features, but I can’t give generic advice here.
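The one-parameter-at-a-time design can be sketched as a small helper that expands a baseline configuration into a list of runs, each differing from the baseline in exactly one setting. The helper name, the configuration keys other than `detection_alpha`, and all values are hypothetical placeholders rather than recommended settings.

```python
def one_at_a_time_configs(baseline, alternatives):
    """Return (changed_name, config) pairs that differ from baseline in one setting."""
    configs = []
    for name, values in alternatives.items():
        for value in values:
            if value == baseline[name]:
                continue  # identical to the baseline run, nothing to ablate
            config = dict(baseline)  # copy so the baseline stays untouched
            config[name] = value
            configs.append((name, config))
    return configs


# Hypothetical baseline and alternatives, for illustration only.
baseline = {"detection_alpha": 200, "cells_per_location": 30, "use_hyperprior": True}
alternatives = {
    "detection_alpha": [20, 200],
    "cells_per_location": [8, 30],
    "use_hyperprior": [True, False],
}

ablation_runs = one_at_a_time_configs(baseline, alternatives)
for changed, config in ablation_runs:
    print(changed, config)
```

Each run is then trained and scored with the same metric as the baseline. Testing interactions would mean changing two settings together, which this helper deliberately does not do.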

You can determine which changes improve the model. You can’t really prove that the model is optimal, but you can say that it is good enough for the data analysis problem at hand.

In some cases, adding another layer of priors won’t make any difference. In others, such as the cell abundance prior and detection_alpha, you want regularisation: the former encourages the model to learn cell abundance similar to the prior, and the latter makes the normalisation model less flexible.
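A small numerical sketch of why a concentration-style hyperparameter acts as regularisation. It assumes, purely for illustration, that the per-location normalisation factor has a Gamma prior with shape `alpha` and rate `alpha / mean`, so its standard deviation is `mean / sqrt(alpha)`. That parameterisation is an assumption here and should be checked against the Supplementary Methods. Under it, a larger `alpha` keeps the factor close to its prior mean, which is what "less flexible" means.

```python
import random
import statistics


def sample_gamma_prior(mean, alpha, n_samples, rng):
    """Draw from Gamma(shape=alpha, rate=alpha/mean); gammavariate takes (shape, scale)."""
    return [rng.gammavariate(alpha, mean / alpha) for _ in range(n_samples)]


rng = random.Random(0)
spread = {}
for alpha in (2.0, 20.0, 200.0):  # illustrative values only
    draws = sample_gamma_prior(1.0, alpha, 5000, rng)
    spread[alpha] = statistics.pstdev(draws)
    print(f"alpha={alpha:>5}: mean={statistics.fmean(draws):.3f} sd={spread[alpha]:.3f}")
```

The printed standard deviation shrinks as `alpha` grows while the mean stays near 1, so the prior pulls the learned values towards the mean more strongly.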

See the tutorials for detailed explanations of how the probabilistic programming framework works.