Posterior probability interpretability (soft=True)

Hello,

I used scANVI to transfer cell type labels from a single-cell reference to my spatial transcriptomics data (MERFISH).

I assigned labels at three hierarchical levels and then compared the prediction scores obtained with the .predict(..., soft=True) function.

  • At the family level (e.g., ET, IT, GABA in the cortex), the scores are almost all close to 0.

  • At the supertype level (e.g., L4 ITs, Pvalbs, etc.), the results look much better, with most cells having scores close to 1.

I would have expected the family level—being more general—to yield stronger prediction scores than the more specific supertype level.

I searched GitHub issues and only found the following reference:

My questions are:

  1. Can these scores be used for filtering?

  2. Should I trust them as an indicator of assignment confidence?

  3. Could you explain how the scores are actually calculated? I read the description in the paper, but I’m not sure I fully understood it.

Thanks a lot for your help!

Hey,

  1. They can be used for filtering, but first you should convince yourself that your model is good, so you will likely need some form of validation: hold out part of the labelled data, train the model, and check that the predicted labels on the held-out cells match the ground truth using standard classification metrics.
  2. As Martin suggested, they are not well calibrated, so I can't say with certainty that they are a reliable measure of confidence (you can run a bootstrap to check this for yourself).
  3. The classifier uses a softmax layer as the final activation of the neural network, converting the logits into the predicted probability of assignment to each of the given class labels (a probability distribution over all classes).
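To make point 1 concrete, here is a minimal sketch of a hold-out validation. The labels and the perfect "predictions" are stand-ins; in a real run you would mask the held-out labels as unlabelled, retrain scANVI, and compare its predictions on those cells against the ground truth.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)

# Stand-in for your reference labels; in practice these come from adata.obs.
labels = rng.choice(["L4 IT", "Pvalb", "Sst"], size=300)

# Set aside 20% of the labelled cells for evaluation.
train_idx, test_idx = train_test_split(
    np.arange(labels.size), test_size=0.2, random_state=0
)

# Placeholder predictions just to show the metric calls; in practice
# these would be model.predict() outputs on the held-out cells.
predicted = labels[test_idx]

print(accuracy_score(labels[test_idx], predicted))
print(f1_score(labels[test_idx], predicted, average="macro"))
```

With real predictions, per-class metrics (e.g. `average=None` in `f1_score`) are useful to see which cell types the model confuses.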
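And to illustrate point 3, here is a small numpy sketch (not the scvi-tools internals) of what the softmax does: it maps one logit per class to a probability distribution over the classes, which is what `soft=True` returns per cell.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability; output sums to 1.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# One logit per class, e.g. ET / IT / GABA at the family level.
logits = np.array([2.0, 0.5, -1.0])
probs = softmax(logits)
print(probs)        # probability assigned to each class
print(probs.sum())  # sums to 1.0
```

Note that a softmax over many classes can spread probability mass thinly, which is one reason scores at different hierarchy levels are not directly comparable.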

See more in the scANVI tutorials:

for example:

BTW,

I would suggest checking out our new spatial models, which might help in your case (see RESOLVI and scVIVA).