Hi, many thanks for your great tool!
I tried using the seed labeling on the Tusi 2018 data and their marker genes from Supp table 1. I strictly followed your tutorial (https://docs.scvi-tools.org/en/stable/user_guide/notebooks/seed_labeling.html) and the first results look promising. However, I would like to get the posterior probability score which you used in your original publication (e.g. Figure 6 D).
On that note, is there an implemented way to dismiss labels if they have no support based on the trained model?
Is it possible to extract it from the model? I tried dir(scvi.model._scanvi.SCANVI) but could not find it.
So I think you just want to add soft=True to the predict method of SCANVI. However, we do have a bug in this where it’s not correctly outputting a dataframe.
Hello Adam, many thanks for your reply! So I added the following lines:
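import pandas as pd
# Per-cell label probabilities (one column per label)
y_pred = scanvi_model.predict(adata, soft=True)
pred = pd.DataFrame(y_pred)  # works whether predict returns an array or a DataFrame
# Highest probability across all labels for each cell
pred_score = pred.max(axis=1).to_numpy()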
I assume that pred_score is now the maximum score across all labels for each cell, which should correspond to the label assigned to that cell.
Background: I annotated progenitor cells with SingleR and used the top 10 SingleR labels as seed labels with scanvi. That is the comparison of the scvi vs SingleR score.
Your process is correct. We just released the new version which fixes the issue with the soft prediction, so I recommend updating to it.
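For anyone following along, here is a minimal sketch of the process discussed above, extended to the earlier question about dismissing poorly supported labels. It assumes the soft prediction is a table with one row per cell and one column per label. The small DataFrame stands in for the output of predict(adata, soft=True). The cell names, label names, probabilities, the helper summarize_soft_predictions, the 0.5 cutoff and the "Unassigned" placeholder are all made up for illustration. Thresholding the maximum probability is one simple way to drop weak labels, not a built-in scvi-tools feature.

```python
import pandas as pd

# Stand-in for the soft prediction output: rows are cells, columns are labels,
# and each row sums to 1. All names and values here are illustrative only.
soft = pd.DataFrame(
    [[0.90, 0.05, 0.05],
     [0.40, 0.35, 0.25],
     [0.10, 0.15, 0.75]],
    index=["cell_1", "cell_2", "cell_3"],
    columns=["Erythroid", "Monocyte", "Neutrophil"],
)


def summarize_soft_predictions(soft, min_prob=0.5, unassigned="Unassigned"):
    """Hypothetical helper: top label, its probability, and a filtered label."""
    pred_score = soft.max(axis=1)      # highest probability per cell
    pred_label = soft.idxmax(axis=1)   # label that attains that probability
    # Keep the label only where its probability reaches the chosen cutoff.
    filtered = pred_label.where(pred_score >= min_prob, unassigned)
    return pd.DataFrame(
        {"label": pred_label, "score": pred_score, "filtered_label": filtered}
    )


summary = summarize_soft_predictions(soft)
print(summary)
```

With a real model, soft would be replaced by the DataFrame built from the soft prediction. The cutoff is a judgment call and is best chosen by looking at the distribution of scores in your own data.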
Great, many thanks for your tools and dedication!