Questions about the Cell2location mathematical model

Hello

I am a graduate student majoring in mathematics at Harbin Institute of Technology, China. My research focuses on bioinformatics. I recently read the paper “Cell2location maps fine-grained cell types in spatial transcriptomics”. I am very interested in the mathematical model in this article and have studied it carefully, but I still have some questions after reading the Supplementary Methods. I would appreciate any answers or thoughts on these questions.
Question 1: How exactly is a probabilistic graphical model with so many levels constructed?
Question 2: How should an ablation experiment with so many parameters be designed?
How can it be shown that the current model is optimal?
Why not add another layer of prior distributions over each hyperparameter in the graph?
Question 3: If so, how are the specific optimizations implemented in code?

Thanks