I tried seed labeling on the Tusi 2018 data, using their marker genes from Supplementary Table 1. I strictly followed your tutorial (https://docs.scvi-tools.org/en/stable/user_guide/notebooks/seed_labeling.html) and the first results look promising. However, I would like to get the posterior probability score that you used in your original publication (e.g. Figure 6 D).
Is it possible to extract this score from the model? I tried dir(scvi.model._scanvi.SCANVI) but could not find it.
On that note, is there an implemented way to dismiss labels that have no support in the trained model?
I think you just want to pass soft=True to the predict method of SCANVI. However, we currently have a bug where it does not correctly output a DataFrame.
Hello Adam, many thanks for your reply! So I added the following lines:
import pandas as pd
y_pred = scanvi_model.predict(adata, soft=True)  # per-cell probabilities, one column per label
pred = pd.DataFrame(y_pred)
pred_score = pred.max(axis=1).to_numpy()  # highest probability for each cell
I assume that pred_score is now the maximum probability across all labels for each cell, which should correspond to the label assigned to that cell.
Background: I annotated progenitor cells with SingleR and used the top 10 SingleR labels as seed labels with scanvi. That is the comparison of the scvi vs SingleR score.
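On the earlier question about dismissing labels without support: the thread does not mention a built-in option for this, so here is a minimal sketch of one way to do it by hand from the soft predictions. It assumes the output of predict(adata, soft=True) is a cells-by-labels table of probabilities. The helper name summarize_soft_predictions, the "unassigned" placeholder, the 0.5 cutoff, and the toy label names are all made up for illustration and are not part of scvi-tools.

```python
import numpy as np
import pandas as pd


def summarize_soft_predictions(probs, labels=None, min_score=0.5, unassigned="unassigned"):
    """Hypothetical helper: best label and its probability for each cell.

    probs      cells x labels probabilities (NumPy array or DataFrame)
    labels     optional column names, in the same order as the columns of probs
    min_score  arbitrary cutoff; cells whose best probability is lower are dismissed
    """
    # Copy so that renaming columns never touches the caller's object.
    probs = pd.DataFrame(probs).copy()
    if labels is not None:
        probs.columns = list(labels)
    score = probs.max(axis=1)                       # highest probability per cell
    label = probs.idxmax(axis=1).astype(object)     # column holding that probability
    label = label.where(score >= min_score, unassigned)
    return pd.DataFrame({"label": label, "score": score})


# Toy stand-in for scanvi_model.predict(adata, soft=True); the label names are placeholders.
toy = np.array([
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.20, 0.70],
])
summary = summarize_soft_predictions(toy, labels=["Ery", "Mono", "Neu"], min_score=0.5)
print(summary)
```

With real data you would pass the soft predictions instead of toy. The cutoff is a judgment call and should be chosen by looking at the distribution of scores, for example per seed label.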