Using scVI-integrated data for scVelo and CellRank analysis - what embeddings to supply for velocity?

Hi everyone,

I am confused about which embeddings should be used to calculate neighbours and moments for scVelo. In short, I have 4 samples that were corrected for library-prep batch effects and harmonized with scVI. I then used Palantir to generate diffusion maps (standard and multiscale) without MAGIC imputation, and drew UMAPs, TSNE and FA graphs from those embeddings.

When proceeding with scVelo, I am unsure whether I should supply the scVI embeddings, since they are the batch-corrected representation. I am also wondering whether supplying any of the diffusion map embeddings would be a valid choice.
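
To make the question concrete, here is a toy sketch of how I understand the neighbours and moments step. It is plain Python rather than scVelo code: `knn_indices` and `first_moments` are hypothetical helpers, the 2-D `latent` list stands in for whichever representation is supplied (for example the scVI latent space), and the unweighted averaging is a simplified stand-in for scVelo's moments. As far as I understand, the representation only decides who the neighbours are, while the values being smoothed are the spliced/unspliced counts in gene space.

```python
import math

def knn_indices(embedding, k):
    """For each cell, return the indices of its k nearest cells (Euclidean), excluding itself."""
    neighbours = []
    for i, xi in enumerate(embedding):
        order = sorted((math.dist(xi, xj), j) for j, xj in enumerate(embedding) if j != i)
        neighbours.append([j for _, j in order[:k]])
    return neighbours

def first_moments(counts, neighbours):
    """Average each cell's gene values with those of its neighbours (the cell itself included)."""
    smoothed = []
    for i, nbrs in enumerate(neighbours):
        group = [i] + nbrs
        n_genes = len(counts[i])
        smoothed.append([sum(counts[c][g] for c in group) / len(group) for g in range(n_genes)])
    return smoothed

# Toy data: 4 cells, a 2-D stand-in for a latent space, and spliced counts for one gene
latent = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
spliced = [[2.0], [4.0], [10.0], [20.0]]

neighbours = knn_indices(latent, k=1)   # graph comes from the embedding
ms = first_moments(spliced, neighbours)  # smoothing happens on the counts
print(neighbours)  # [[1], [0], [3], [2]]
print(ms)          # [[3.0], [3.0], [15.0], [15.0]]
```

In scanpy/scVelo terms, I believe this corresponds to building the graph with `sc.pp.neighbors(adata, use_rep="X_scVI")` (assuming the latent space is stored under that key) and then calling `scv.pp.moments(adata, n_pcs=None, n_neighbors=None)` so that the existing graph is reused, but I would appreciate confirmation.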

Finally, if I later subset the data (e.g. keeping only half of the cell types for analysis), which embeddings can I supply then? I assume the scVI ones would need to be recomputed somehow.
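
For the subsetting case, my working assumption (not confirmed) is that the per-cell embedding rows can simply be sliced, while the neighbour graph has to be rebuilt on the remaining cells. A toy sketch in plain Python, with a hypothetical `knn_indices` helper and made-up 1-D coordinates and labels:

```python
import math

def knn_indices(embedding, k):
    """Indices of the k nearest cells (Euclidean) for each cell, excluding itself."""
    result = []
    for i, xi in enumerate(embedding):
        order = sorted((math.dist(xi, xj), j) for j, xj in enumerate(embedding) if j != i)
        result.append([j for _, j in order[:k]])
    return result

# Toy data: a 1-D stand-in for a per-cell embedding, with made-up cell type labels
embedding = [[0.0], [1.0], [1.5], [3.0], [3.2]]
cell_type = ["A", "B", "A", "B", "A"]

full_graph = knn_indices(embedding, k=1)

# Keep only type "A": slice the embedding rows, then rebuild the graph on the subset
keep = [i for i, t in enumerate(cell_type) if t == "A"]
subset_embedding = [embedding[i] for i in keep]
subset_graph = knn_indices(subset_embedding, k=1)

# Map subset positions back to the original cell indices for comparison
subset_graph_original_ids = [[keep[j] for j in nbrs] for nbrs in subset_graph]

print(full_graph)                 # [[1], [2], [1], [4], [3]]
print(subset_graph_original_ids)  # [[2], [0], [2]]
```

Here every kept cell ends up with a different nearest neighbour after subsetting, which is why I assume the graph (and the moments derived from it) cannot be carried over even if the embedding rows can. Whether the scVI model itself should also be retrained on the subset is the part I am unsure about.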

Please advise, if possible. Thank you for any help!


Hi, did you resolve this?
I have a similar question about using Scanorama embeddings for scVelo and CellRank.
Thanks

Hi there. Based on what I’ve learned so far, I think it is fine to supply whichever embedding makes the most sense for your data: all of them are valid, although they may give different results. scANVI/scVI embeddings are a good choice if your data had to be integrated. If not, TSNE or PCA-based embeddings work well. In my case, the data also had to be ordered by age, so I first built an augmented affinity matrix and used it to compute the diffusion map. I ended up using TSNE embeddings computed on the multiscale diffusion-map space, which I think makes sense for my work. I’ve also tried other embeddings and got similar results.
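
One way to put a number on "similar results" is to compare the neighbour sets that two embeddings produce for the same cells. The sketch below is plain Python on made-up 1-D coordinates; `knn_sets` and `mean_neighbour_jaccard` are hypothetical helpers, not functions from scVelo or CellRank.

```python
import math

def knn_sets(embedding, k):
    """Set of the k nearest cells (Euclidean) for each cell, excluding itself."""
    sets = []
    for i, xi in enumerate(embedding):
        order = sorted((math.dist(xi, xj), j) for j, xj in enumerate(embedding) if j != i)
        sets.append({j for _, j in order[:k]})
    return sets

def mean_neighbour_jaccard(emb_a, emb_b, k):
    """Average per-cell Jaccard overlap between the neighbour sets of two embeddings."""
    sets_a = knn_sets(emb_a, k)
    sets_b = knn_sets(emb_b, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(sets_a, sets_b)]
    return sum(scores) / len(scores)

# Toy embeddings of the same 4 cells
emb_a = [[0.0], [1.0], [3.0], [6.0]]
emb_b = [[0.0], [2.0], [6.0], [12.0]]  # emb_a rescaled: same neighbours
emb_c = [[0.0], [6.0], [1.0], [3.0]]   # cells rearranged: mostly different neighbours

same = mean_neighbour_jaccard(emb_a, emb_b, k=1)
different = mean_neighbour_jaccard(emb_a, emb_c, k=1)
print(same, different)  # 1.0 0.25
```

A score near 1 means the two embeddings would feed nearly the same neighbour graph into the velocity step; a low score suggests the downstream results could differ.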

I am not 100% sure that this is the correct approach, so it would be great if the original package creators could comment.

Thank you! This was helpful