How to correctly transfer annotations from AnnData to Seurat

Hi,

I processed my RNA-seq data and annotated it using scvi, and I would now like to integrate the data in R. This is my AnnData structure:

AnnData object with n_obs × n_vars = 16712 × 10494
    obs: 'sex', 'region', 'subcluster', 'cluster', 'nCount_RNA', 'nFeature_RNA', 'leiden', 'dataset', 'orig_ident', 'broad_names', 'simple_name', 'integrated_snn_res_1', 'integrated_snn_res_0_8', 'integrated_snn_res_1_2', 'broad_names2', '_scvi_batch', '_scvi_labels', 'cell type'
    var: 'ensembl_common_ID'
    obsm: 'X_pca', 'X_umap'
    layers: 'counts'

It contains the two datasets I want to integrate (labelled in the obs column ‘dataset’).
I am importing this AnnData object into R as follows:

library(Seurat)
library(SeuratData)
library(SeuratDisk)
library(rhdf5)

# Load my data ------------------------------------------------------------

Convert("C:/Users/data/concatenated_filtered_CSR.h5ad", dest = "h5seurat", overwrite = TRUE)

# Open a read-only connection to the h5Seurat file and read the metadata group
mouse_combined <- Connect("C:/Users/data/concatenated_filtered_CSR.h5seurat", mode = "r")
metadata <- h5read("C:/Users/data/concatenated_filtered_CSR.h5seurat",
                   "/meta.data")

# Extract the counts data
counts_data <- mouse_combined[["assays"]][["counts"]][["data"]]

# Extract the 'dataset' metadata from /meta.data
dataset_info <- mouse_combined[["meta.data"]][["dataset"]]


# Convert counts_data into a matrix
counts_matrix <- as.matrix(counts_data)

# Create a Seurat object (assay defaults to "RNA")
mouse_combined_seurat <- CreateSeuratObject(counts = counts_matrix)

# Read the cell names from the h5Seurat file and assign them to the Seurat object
# cell_names <- colnames(mouse_combined_seurat)
cell_names <- h5read("C:/Users/data/concatenated_filtered_CSR.h5seurat",
                     "/cell.names")
colnames(mouse_combined_seurat) <- cell_names


# Extract dataset and cell type info; both are stored as categoricals
# (integer codes plus a vector of category labels)
dataset_info <- metadata[["dataset"]]
cell_type <- metadata[["cell type"]]

# Codes are 0-based, so add 1 for R's 1-based indexing
dataset_categories <- dataset_info$categories[dataset_info$codes + 1]
expanded_cell_type <- cell_type$categories[cell_type$codes + 1]

# Name both vectors by cell so they can be matched to the Seurat object
names(dataset_categories) <- cell_names
names(expanded_cell_type) <- cell_names

# Replace empty values in expanded_cell_type with "none"
expanded_cell_type[expanded_cell_type == ""] <- "none"

# Add dataset and cell type metadata to the Seurat object
mouse_combined_seurat$dataset <- dataset_categories
mouse_combined_seurat$cell_type <- expanded_cell_type
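
The `categories[codes + 1]` lookup above only stays aligned with the cell names if every code is a valid index. Below is a minimal sketch of the same decoding step, written in Python for illustration. It assumes the pandas convention that a missing value is stored as code -1; whether this file contains any such codes is not known from the post. `decode_categorical` and the toy data are hypothetical and not part of Seurat or anndata.

```python
def decode_categorical(codes, categories, missing="none"):
    """Map 0-based integer codes to labels.

    Negative codes and empty labels are replaced by `missing`.
    """
    labels = []
    for code in codes:
        if code < 0:
            # No category for this cell (pandas stores NaN as code -1)
            labels.append(missing)
        else:
            label = categories[code]
            labels.append(label if label != "" else missing)
    return labels


# Toy data (hypothetical): four cells, the third has no annotation
cell_names = ["cell_1", "cell_2", "cell_3", "cell_4"]
categories = ["astrocyte", "neuron"]
codes = [1, 0, -1, 1]

labels = decode_categorical(codes, categories)

# There must be exactly one label per cell, in the same order as the cell names
assert len(labels) == len(cell_names)
cell_type_by_cell = dict(zip(cell_names, labels))
print(cell_type_by_cell)
```

In R, an index of 0 is silently dropped, so `categories[codes + 1]` would return a shorter vector if any code were -1. Comparing the length of the decoded vector with the number of cell names would catch that case.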

As a sanity check, I want to map the annotations from one of the two datasets (stored in the AnnData obs column ‘cell type’, which is only filled in for one of the datasets) onto the UMAP plot before the integration. However, the cell type labels appear all over the plot (left), including on cells from the wrong dataset. The plot on the right shows the same UMAP coloured by the two datasets I want to integrate. So I suspect something is wrong in the way I am importing the ‘cell type’ annotation.

Does anyone have a suggestion for the correct way to import the obs annotations into the Seurat object?

Hi,
I would strongly recommend against doing this manually. You can use sceasy or zellkonverter for this purpose. As a sanity check, though, I would first make the UMAP plot in Python using scanpy.
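
The Python-side check could look roughly like the sketch below. It assumes the obs column names ‘dataset’ and ‘cell type’ from the AnnData summary in the question, and that unannotated cells carry an empty label. `crosstab` and `plot_umap_check` are hypothetical helper names. `plot_umap_check` is defined but not called here, because it needs scanpy and the .h5ad file.

```python
from collections import Counter


def crosstab(datasets, cell_types):
    """Count cells for every (dataset, cell type) pair."""
    return Counter(zip(datasets, cell_types))


def plot_umap_check(h5ad_path):
    """Plot the stored UMAP coloured by dataset and cell type (requires scanpy)."""
    import scanpy as sc  # imported here so the rest of the sketch runs without scanpy

    adata = sc.read_h5ad(h5ad_path)
    sc.pl.umap(adata, color=["dataset", "cell type"])
    return adata


# Toy stand-ins (hypothetical) for adata.obs["dataset"] and adata.obs["cell type"]
datasets = ["A", "A", "A", "B", "B"]
cell_types = ["neuron", "astrocyte", "neuron", "", ""]

counts = crosstab(datasets, cell_types)
for (dataset, cell_type), n in sorted(counts.items()):
    print(dataset, repr(cell_type), n)

# Datasets that carry at least one non-empty annotation
annotated = {d for (d, ct), n in counts.items() if ct != ""}
print("annotated datasets:", annotated)
```

If the labels appear in only one dataset in Python but in both datasets in the Seurat plot, the mismatch was introduced during the import and not in the AnnData object.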

Hi @cane11

Thank you so much, that worked perfectly!