I trained SCVI and SCANVI models on a dataset. To test the probabilities output by soft labeling, I withheld one cluster from the model. The theory was that the model should classify cells in this cluster with low confidence (a lower max probability). I did 10 runs of this exercise, each time training on all clusters except the ‘Unseen cluster’. Here is what I get:

The left plot shows the fraction of cells with max probability below 0.95, with each of the 10 runs of the model colored separately. As expected, this fraction is quite low (~2%) among training cells and higher among cells from the ‘Unseen cluster.’ The right plot shows the median and interquartile range (25th to 75th percentile) of max probability for the 10 runs of the model, for training cells and for ‘Unseen’ cells.
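For reference, the two summaries in the plots can be computed roughly as in the sketch below. It assumes the soft predictions are available as a cells-by-labels matrix of probabilities (as far as I know, this is what `SCANVI.predict(soft=True)` returns, as a DataFrame). The function name `summarize_max_prob` and the toy matrix are hypothetical and only for illustration; they are not my actual data.

```python
import numpy as np

def summarize_max_prob(soft_probs, threshold=0.95):
    """Summarize prediction confidence from a (cells x labels) probability matrix.

    `soft_probs` is assumed to hold one row per cell and one column per label,
    with each row summing to 1.
    """
    probs = np.asarray(soft_probs, dtype=float)
    # Confidence of each cell = probability of its most likely label.
    max_prob = probs.max(axis=1)
    q25, median, q75 = np.percentile(max_prob, [25, 50, 75])
    return {
        # Fraction of cells whose max probability falls below the threshold.
        "frac_below": float(np.mean(max_prob < threshold)),
        "median": float(median),
        "q25": float(q25),
        "q75": float(q75),
    }

# Toy example (made-up numbers): two confident cells, two uncertain cells.
toy_probs = np.array([
    [0.99, 0.01, 0.00],
    [0.97, 0.02, 0.01],
    [0.60, 0.30, 0.10],
    [0.50, 0.45, 0.05],
])
toy_summary = summarize_max_prob(toy_probs)
print(toy_summary)
```

Calling this once per run, separately for the training cells and the ‘Unseen’ cells, gives one point per run for the left plot and one median/IQR triple per run for the right plot.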

What I find strange is the variation in the output of SCANVI.predict across 10 runs on the same training data. The variation is quite large. Is this the expected stochasticity between model runs?
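To put a number on "quite large", one option is to summarize the spread of the per-run fractions directly. This is only a minimal sketch: the helper name `run_to_run_spread` is hypothetical, and the input values are placeholders, not my measured results.

```python
import numpy as np

def run_to_run_spread(fractions_per_run):
    """Spread of a per-run metric (e.g. fraction of cells with max prob < 0.95)."""
    values = np.asarray(fractions_per_run, dtype=float)
    return {
        "mean": float(values.mean()),
        # Sample standard deviation across runs (ddof=1).
        "std": float(values.std(ddof=1)),
        "min": float(values.min()),
        "max": float(values.max()),
    }

# Placeholder values for three runs, for illustration only.
toy_spread = run_to_run_spread([0.10, 0.20, 0.30])
print(toy_spread)
```

Comparing this spread between the training cells and the ‘Unseen’ cells would show whether the run-to-run variation is specific to the withheld cluster.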