Hi, thank you for this great tool. I would like to deconvolve several cancer tissue spatial transcriptomics datasets that do not have matched scRNA-seq data. What are the important considerations when choosing reference datasets, other than ensuring they come from the same tissue? Thanks!
Generally, use the same sequencing technology if possible, which in most cases means poly-A capture and UMI-based protocols.
You see an improvement when the reference contains fibroblasts and endothelial cells and, in your case, cancer cells of a similar type. Those are the cells missing from most scRNA-seq experiments. I have had good results using Smart-seq2 or snRNA-seq data for those cell types and concatenating them with scRNA-seq data for the other cell types.
To verify the results, I would also recommend running Cell2location and checking that both results match.
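One simple way to check that two deconvolution results match is a per-cell-type correlation across spots. The sketch below assumes both outputs have already been exported as spot-by-cell-type pandas DataFrames (how you export them depends on each tool); `compare_proportions` and the toy data are hypothetical names used only for illustration.

```python
import numpy as np
import pandas as pd


def compare_proportions(props_a, props_b):
    """Pearson correlation per cell type between two spot x cell-type tables."""
    spots = props_a.index.intersection(props_b.index)
    cell_types = props_a.columns.intersection(props_b.columns)
    a = props_a.loc[spots, cell_types]
    b = props_b.loc[spots, cell_types]
    # Rescale each spot to sum to 1 so that abundances and proportions are comparable.
    a = a.div(a.sum(axis=1), axis=0)
    b = b.div(b.sum(axis=1), axis=0)
    return a.corrwith(b, axis=0)


# Toy example: the second result is a rescaled copy of the first plus an unshared column.
rng = np.random.default_rng(0)
spot_names = [f"spot{i}" for i in range(50)]
result_a = pd.DataFrame(
    rng.dirichlet(np.ones(3), size=50),
    index=spot_names,
    columns=["Tumor", "Fibroblast", "Endothelial"],
)
result_b = result_a * 7.0
result_b["Unshared"] = 1.0

agreement = compare_proportions(result_a, result_b)
print(agreement)
```

Cell types with a low correlation are the ones worth inspecting on the tissue image before trusting either result.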
As a final check, you can output the background gene expression, st_model.module.eta, print the most highly expressed genes, and look among them for marker genes of expected cell types. This tells you which cell type is still missing from your reference (most of the time fibroblasts and endothelial cells, in my hands).
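A minimal sketch of that final check is below. It works on a plain array so it runs without a trained model; with a real model you would presumably pass something like `st_model.module.eta.detach().cpu().numpy()` together with the gene names of the spatial dataset (this assumes eta is a per-gene torch tensor). The helper names, the toy values and the short marker lists are illustrative assumptions, not part of the tool.

```python
import numpy as np


def top_background_genes(eta, gene_names, n_top=20):
    """Return the n_top genes with the largest background term, highest first."""
    eta = np.asarray(eta, dtype=float).ravel()
    gene_names = np.asarray(gene_names)
    if eta.shape[0] != gene_names.shape[0]:
        raise ValueError("eta and gene_names must have the same length")
    order = np.argsort(eta)[::-1][:n_top]
    return [(str(gene_names[i]), float(eta[i])) for i in order]


def markers_in_background(top_genes, marker_sets):
    """For each cell type, list which of its marker genes appear among the top genes."""
    top = {gene for gene, _ in top_genes}
    return {cell_type: sorted(top & set(markers)) for cell_type, markers in marker_sets.items()}


# Toy values standing in for the model's background parameter.
genes = ["COL1A1", "PECAM1", "CD3E", "MS4A1", "DCN"]
eta = np.array([2.5, 1.8, -0.7, -1.2, 2.1])

# Illustrative marker lists; replace with markers suited to your tissue.
marker_sets = {
    "Fibroblast": ["COL1A1", "COL1A2", "DCN"],
    "Endothelial": ["PECAM1", "VWF"],
    "T cell": ["CD3E", "CD2"],
}

top = top_background_genes(eta, genes, n_top=3)
hits = markers_in_background(top, marker_sets)
print(top)
print(hits)
```

A cell type whose markers dominate the top of this list is a candidate for being absent from the reference.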
Thank you very much for your reply. So would I be able to use scRNA-seq data captured from another patient but the same tissue, e.g. breast cancer, to deconvolve another patient's spatial transcriptomics dataset?
Yes, that should be fine. If the cancer is quite heterogeneous, it makes sense to include multiple patients in the reference to capture the heterogeneity of the tumor cells.