Scvi-tools and xenium

wangjiawen2013 · August 15, 2024, 2:25am

Hi,
Can I use scvi-tools to process a 10x Genomics Xenium spatial transcriptomics dataset? In the code below, adata is an AnnData object built from a Xenium dataset.

import scvi

# Register the data: raw counts live in the "counts" layer, one batch per library
scvi.model.SCVI.setup_anndata(adata, layer="counts", batch_key="library_id")
# Create and train the model
model = scvi.model.SCVI(adata, n_hidden=128, n_latent=30, n_layers=2, dispersion="gene")
model.train()

cane11 · August 18, 2024, 3:15pm

Hi, yes, it works. However, this data contains fewer counts per cell than standard single-cell data and is noisier (e.g. background). We hope to release, later this month, an adapted model that works with image-based spatial transcriptomics and has an improved framework for modeling this data. In the meantime, I would recommend looking into ProSeg for segmentation.

wangjiawen2013 · August 19, 2024, 3:41am

I have tested scVI with a Xenium dataset. When there is only one sample (~250,000 cells, ~100 genes), the UMAP and Leiden clustering look good. But when I integrated two Xenium datasets with scVI, I got a lot of tiny clusters (~40 cells per cluster) in addition to two major clusters! This has never occurred when integrating multiple single-cell datasets. So it looks like there are many differences between Xenium and single-cell datasets, and the structure of Xenium data is not very compatible with scVI.

cane11 · August 19, 2024, 10:05am

These isolated islands are a problem with low-count data and UMAP, not so much with scVI. You can mitigate this with stricter filtering to remove low-count cells. I assume your second sample has lower quality, and if you run it separately you will see similar behavior. You can also reduce this behavior by increasing n_neighbors before running UMAP. You get pretty similar behavior if you run scVI on very poor-quality (low library complexity) single-cell data. The reason is simply that in these cases 40 cells can have exactly the same expression values.
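
A minimal sketch of the two ideas in this reply: counting how many cells share an identical expression profile, and dropping low-count cells before embedding. It is plain Python on a toy count matrix, not scvi-tools or scanpy code; the helper names (`filter_low_count_cells`, `largest_identical_group`) and the `min_counts` threshold are hypothetical and chosen only for illustration. With real data you would pass rows of a dense raw-count matrix, which is an assumption about how your counts are stored.

```python
from collections import Counter


def filter_low_count_cells(rows, min_counts):
    """Keep only cells (rows) whose total count is at least min_counts."""
    return [row for row in rows if sum(row) >= min_counts]


def largest_identical_group(rows):
    """Return the size of the largest group of cells with identical counts."""
    profile_sizes = Counter(tuple(row) for row in rows)
    return max(profile_sizes.values(), default=0)


# Toy matrix: rows are cells, columns are genes (hypothetical values).
cells = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [4, 2, 7],
    [3, 5, 1],
]

# Three low-count cells are indistinguishable from one another.
print(largest_identical_group(cells))

# Stricter filtering removes them (threshold of 5 is illustrative only).
kept = filter_low_count_cells(cells, min_counts=5)
print(len(kept), largest_identical_group(kept))
```

If the largest identical group is close to the size of the tiny clusters you see, the islands most likely come from duplicated low-count profiles rather than from the integration itself.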

wangjiawen2013 · August 30, 2024, 8:29am

Thanks, the isolated islands were indeed caused by low library complexity. In my case, both datasets were of good quality. However, the datasets came from different sources, and too few genes remained after integration: only the genes expressed in both datasets were kept, so a lot of genes were filtered out.
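
The gene loss described here can be checked before integrating. The sketch below is plain Python with a hypothetical helper, `shared_genes`, and two made-up gene panels; with AnnData objects you would pass each dataset's gene names instead, which is an assumption about your setup.

```python
def shared_genes(*panels):
    """Return genes present in every panel, in the order of the first panel."""
    if not panels:
        return []
    common = set(panels[0]).intersection(*panels[1:])
    return [gene for gene in panels[0] if gene in common]


# Hypothetical gene panels from two different sources.
panel_a = ["CD3E", "CD8A", "MS4A1", "EPCAM", "KRT8"]
panel_b = ["CD3E", "MS4A1", "PTPRC", "KRT8", "VIM"]

common = shared_genes(panel_a, panel_b)
print(f"{len(common)} of {len(panel_a)} genes survive the intersection: {common}")
```

A small shared set relative to either panel is a warning that the integrated data will have lower library complexity than each dataset on its own.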

reinertanalytics · September 3, 2024, 4:48pm

Hi, thanks for a great package and good discussion here. An adapted model for image based spatial transcriptomics would be fantastic! Any ETA on release?

pgd · September 25, 2024, 10:49am

Hi there,
Great to hear there are other people trying to use scVI to integrate samples in a Xenium analysis. I am currently trying to make it work on 15 samples as well, but it isn’t going too well… Is there any update on the adapted model for image-based spatial transcriptomics?

Thanks a lot!
Best wishes

cane11 · March 13, 2025, 5:58pm

It is released and called resolVI. See the scvi-tools docs.
I also hope to release a tutorial for Visium HD soon-ish, with a slightly modified approach: training a model on 2 µm data and running inference on 8 µm data (pretty sure it can be optimized by using cell segmentation).
