Convert Scanpy (h5ad) to Seurat (rds)

PEB · May 16, 2022, 10:29pm

Hi Everyone,

I am trying to convert my h5ad to a Seurat rds to run R-based pseudotime algorithms (monocle, slingshot, etc.). However, I keep running into errors with the commonly posted methods. Does anyone have any advice or experience on how to effectively read a scanpy h5ad in R?

Best,

peb

Justin_Hong · May 16, 2022, 10:51pm

I’ve had luck converting Seurat objects to AnnData objects in memory using sceasy::convertFormat, as demonstrated in our R tutorial, "Integrating datasets with scVI in R" (scvi-tools). You could try using it in the inverse direction via the from and to arguments.

P.S. Would be best to categorize this kind of question in the future under the “AnnData” tag.

PEB · May 16, 2022, 11:46pm

Hi @Justin_Hong, thank you for the tip! I found a lot of good information from the link you provided.

I have run into an error, though. Have you seen this one before?

ad <- anndata::read_h5ad('Results/celltype_assigned_raw.h5ad')
sceasy::convertFormat(ad, from="anndata", to="seurat", outFile='file.rds')

Error in path.expand(inFile): invalid ‘path’ argument
Traceback:

sceasy::convertFormat(ad, from = "anndata", to = "seurat", outFile = "file.rds")
func(obj, outFile = outFile, main_layer = main_layer, ...)
path.expand(inFile)

R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines stats4 stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] sp_1.4-7 SeuratObject_4.1.0
[3] Seurat_4.1.1 anndata_0.7.5.3
[5] sceasy_0.0.6 reticulate_1.25
[7] MAST_1.22.0 plyr_1.8.7
[9] clusterExperiment_2.16.0 gam_1.20.1
[11] foreach_1.5.2 monocle_2.24.0
[13] DDRTree_0.1.5 irlba_2.3.5
[15] VGAM_1.1-6 ggplot2_3.3.6
[17] Matrix_1.4-1 slingshot_2.4.0
[19] TrajectoryUtils_1.4.0 princurve_2.1.6
[21] RColorBrewer_1.1-3 scran_1.24.0
[23] scuttle_1.6.0 SingleCellExperiment_1.18.0
[25] SummarizedExperiment_1.26.1 Biobase_2.56.0
[27] GenomicRanges_1.48.0 GenomeInfoDb_1.32.2
[29] IRanges_2.30.0 S4Vectors_0.34.0
[31] BiocGenerics_0.42.0 MatrixGenerics_1.8.0
[33] matrixStats_0.62.0 jsonlite_1.8.0
[35] formatR_1.12

loaded via a namespace (and not attached):
[1] pbdZMQ_0.3-7 scattermore_0.8
[3] pkgmaker_0.32.2 tidyr_1.2.0
[5] bit64_4.0.5 DelayedArray_0.22.0
[7] rpart_4.1.16 data.table_1.14.2
[9] KEGGREST_1.36.0 RCurl_1.98-1.6
[11] doParallel_1.0.17 generics_0.1.2
[13] ScaledMatrix_1.4.0 leidenbase_0.1.11
[15] cowplot_1.1.1 RSQLite_2.2.14
[17] RANN_2.6.1 combinat_0.0-8
[19] future_1.25.0 bit_4.0.4
[21] phylobase_0.8.10 spatstat.data_2.2-0
[23] xml2_1.3.3 httpuv_1.6.5
[25] assertthat_0.2.1 viridis_0.6.2
[27] hms_1.1.1 evaluate_0.15
[29] promises_1.2.0.1 fansi_1.0.3
[31] progress_1.2.2 igraph_1.3.1
[33] DBI_1.1.2 htmlwidgets_1.5.4
[35] sparsesvd_0.2 spatstat.geom_2.4-0
[37] purrr_0.3.4 ellipsis_0.3.2
[39] dplyr_1.0.9 annotate_1.74.0
[41] gridBase_0.4-7 deldir_1.0-6
[43] locfdr_1.1-8 sparseMatrixStats_1.8.0
[45] vctrs_0.4.1 here_1.0.1
[47] ROCR_1.0-11 abind_1.4-5
[49] cachem_1.0.6 withr_2.5.0
[51] progressr_0.10.0 sctransform_0.3.3
[53] prettyunits_1.1.1 goftest_1.2-3
[55] softImpute_1.4-1 cluster_2.1.3
[57] ape_5.6-2 IRdisplay_1.1
[59] lazyeval_0.2.2 crayon_1.5.1
[61] genefilter_1.78.0 edgeR_3.38.1
[63] pkgconfig_2.0.3 slam_0.1-50
[65] nlme_3.1-157 rlang_1.0.2
[67] globals_0.15.0 lifecycle_1.0.1
[69] miniUI_0.1.1.1 registry_0.5-1
[71] rsvd_1.0.5 rprojroot_2.0.3
[73] polyclip_1.10-0 lmtest_0.9-40
[75] rngtools_1.5.2 IRkernel_1.3
[77] Rhdf5lib_1.18.0 zoo_1.8-10
[79] base64enc_0.1-3 ggridges_0.5.3
[81] pheatmap_1.0.12 png_0.1-7
[83] viridisLite_0.4.0 bitops_1.0-7
[85] rncl_0.8.6 KernSmooth_2.23-20
[87] rhdf5filters_1.8.0 Biostrings_2.64.0
[89] blob_1.2.3 DelayedMatrixStats_1.18.0
[91] stringr_1.4.0 zinbwave_1.18.0
[93] spatstat.random_2.2-0 parallelly_1.31.1
[95] beachmat_2.12.0 scales_1.2.0
[97] memoise_2.0.1 magrittr_2.0.3
[99] ica_1.0-2 howmany_0.3-1
[101] zlibbioc_1.42.0 compiler_4.2.0
[103] HSMMSingleCell_1.16.0 dqrng_0.3.0
[105] fitdistrplus_1.1-8 cli_3.3.0
[107] ade4_1.7-19 XVector_0.36.0
[109] listenv_0.8.0 patchwork_1.1.1
[111] pbapply_1.5-0 mgcv_1.8-40
[113] MASS_7.3-57 tidyselect_1.1.2
[115] stringi_1.7.6 BiocSingular_1.12.0
[117] locfit_1.5-9.5 ggrepel_0.9.1
[119] grid_4.2.0 tools_4.2.0
[121] future.apply_1.9.0 parallel_4.2.0
[123] uuid_1.1-0 bluster_1.6.0
[125] RNeXML_2.4.7 metapod_1.4.0
[127] gridExtra_2.3 Rtsne_0.16
[129] digest_0.6.29 rgeos_0.5-9
[131] shiny_1.7.1 qlcMatrix_0.9.7
[133] Rcpp_1.0.8.3 later_1.3.0
[135] RcppAnnoy_0.0.19 httr_1.4.3
[137] AnnotationDbi_1.58.0 kernlab_0.9-30
[139] colorspace_2.0-3 tensor_1.5
[141] XML_3.99-0.9 uwot_0.1.11
[143] statmod_1.4.36 spatstat.utils_2.3-1
[145] plotly_4.10.0 xtable_1.8-4
[147] R6_2.5.1 pillar_1.7.0
[149] htmltools_0.5.2 mime_0.12
[151] NMF_0.24.0 glue_1.6.2
[153] fastmap_1.1.0 BiocParallel_1.30.2
[155] BiocNeighbors_1.14.0 codetools_0.2-18
[157] utf8_1.2.2 spatstat.sparse_2.1-1
[159] lattice_0.20-45 tibble_3.1.7
[161] leiden_0.4.2 survival_3.3-1
[163] limma_3.52.1 repr_1.1.4
[165] docopt_0.7.1 fastICA_1.2-3
[167] munsell_0.5.0 rhdf5_2.40.0
[169] GenomeInfoDbData_1.2.8 iterators_1.0.14
[171] HDF5Array_1.24.0 reshape2_1.4.4
[173] gtable_0.3.0 spatstat.core_2.4-2

Justin_Hong · May 17, 2022, 12:04am

Looking at their code, it looks like converting from AnnData requires you to pass in an input file path rather than a loaded object. That seems to be because they want to ensure the anndata package is loaded correctly for their code to work.

Try doing this instead:

ad_path <- "Results/celltype_assigned_raw.h5ad"
sceasy::convertFormat(ad_path, from="anndata", to="seurat", outFile="file.rds")

PEB · May 17, 2022, 12:43am

Hey @Justin_Hong, thank you for catching that! You’re right, it needed a string as a parameter instead of an object.

Unfortunately, it did not complete the conversion:

ad_path <- "Results/celltype_assigned_hv.h5ad"
sceasy::convertFormat(ad_path, from="anndata", to="seurat", outFile="file.rds", use_seurat = FALSE, main_layer = "counts")

X -> counts

Error in match(x, table, nomatch = 0L): ‘match’ requires vector arguments
Traceback:

sceasy::convertFormat(ad_path, from = "anndata", to = "seurat",
. outFile = "file.rds", use_seurat = FALSE, main_layer = "counts")
func(obj, outFile = outFile, main_layer = main_layer, ...)
sapply(embed_names, function(x) reticulate::py_to_r(ad$obsm),
. simplify = FALSE, USE.NAMES = TRUE)
lapply(X = X, FUN = FUN, ...)
FUN(X[[i]], ...)
reticulate::py_to_r(ad$obsm)
ad$obsm
`[.collections.abc.Mapping`(ad$obsm, x)
name %in% x$keys()

I found a similar unresolved error on GitHub: "Anndata to Seurat Object, Error in match" (Issue #54, cellgeni/sceasy).
I tried gzip (as suggested in the issue comments), but I haven’t been able to resolve the error yet.
Thanks again for your help!

gtca · June 2, 2022, 1:39pm

Hey @PEB,

You can also try using ReadH5AD() from MuDataSeurat.
There might be rough edges still but at least we can fix them quickly!

It is also a native R reader so no need for the Python environment and reticulate.
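
A minimal sketch of that route, for anyone landing here later. It assumes MuDataSeurat is already installed (the GitHub install line in the comment is an assumption, so check the package README) and reuses the file names from earlier in the thread:

```r
# Assumption: MuDataSeurat is installed, e.g. via
#   remotes::install_github("pmbio/MuDataSeurat")
library(MuDataSeurat)

# Path reused from earlier in the thread; point this at your own file
h5ad_path <- "Results/celltype_assigned_raw.h5ad"

# ReadH5AD() reads the .h5ad file and returns a Seurat object
seurat_obj <- ReadH5AD(h5ad_path)

# Save the Seurat object as .rds for downstream R tools
saveRDS(seurat_obj, file = "file.rds")
```

The .rds file can then be loaded elsewhere with readRDS("file.rds").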

PEB · June 7, 2022, 12:42am

Hey @gtca,

Thanks for reaching out and suggesting MuDataSeurat!
Can you specify what kind of rough edges you’re referring to? Data loss?

I’ve been trying out MuDataSeurat and it’s been working pretty well. I do get some warnings right away, though:

Warning in read_layers_to_assay(h5) :
  Only a subset of mod//raw/X is loaded, variables (features) that are not present in mod//X are discarded.
Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from rna to rna_
Warning: No columnames present in cell embeddings, setting to 'pca_1:50'
Warning: No columnames present in cell embeddings, setting to 'tsne_1:2'
Warning: No columnames present in cell embeddings, setting to 'umap_1:2'

However, I do not think these affected the process.
I am getting some errors when using the data object for pseudotime. I am not sure whether the cause is the converted matrix or the data going into the algorithm.

In any case, thanks! MuDataSeurat is much better than the other methods I’ve been testing.

Best,

PEB

acaulier · December 6, 2023, 3:53pm

Hi @Justin_Hong,

I am trying to convert an rds object (JAN_039) into an AnnData object, following the R tutorial you recommended, "Integrating datasets with scVI in R" (scvi-tools).
But I keep getting an error from the anndata package:

sceasy::convertFormat(JAN_039, from="seurat", to="anndata",
                       outFile='~/JAN_039_scenicplus/JAN_039.h5ad')

module 'anndata' has no attribute 'AnnData'
Traceback:

1. sceasy::convertFormat(JAN_039, from = "seurat", to = "anndata", 
 .     outFile = "~/JAN_039_scenicplus/JAN_039.h5ad")
2. func(obj, outFile = outFile, main_layer = main_layer, ...)
3. anndata$AnnData
4. `$.python.builtin.module`(anndata, "AnnData")
5. `$.python.builtin.object`(x, name)
6. py_get_attr_or_item(x, name, TRUE)
7. py_get_attr(x, name)
8. py_get_attr_impl(x, name, silent)

Any idea what is going wrong here?
Thanks!

bsierieb1 · September 24, 2024, 8:57pm

The link to the tutorial is broken.

Justin_Hong · September 27, 2024, 5:23pm

Sorry about that, here’s the updated link: "Using Python in R with reticulate" (scvi-tools).
