How to concatenate anndata objects when you have CITE-seq data

bioinf · December 30, 2024, 9:04pm

Hello,

I have .h5 files that contain gene expression and CITE-seq data. I am trying to confirm that the way I imported them is correct:

import scanpy as sc

# gex_only=False keeps the antibody (CITE-seq) features alongside gene expression
Sample1 = sc.read_10x_h5(r'E:\Sample1_filtered_feature_bc_matrix.h5', gex_only=False)
Sample1.var_names_make_unique()

Sample2 = sc.read_10x_h5(r'E:\Sample2_filtered_feature_bc_matrix.h5', gex_only=False)
Sample2.var_names_make_unique()

Sample3 = sc.read_10x_h5(r'E:\Sample3_filtered_feature_bc_matrix.h5', gex_only=False)
Sample3.var_names_make_unique()

Sample1.obs['sample'] = "Sample1"
Sample2.obs['sample'] = "Sample2"
Sample3.obs['sample'] = "Sample3"

Merge into one object:

adata_merged = Sample1.concatenate(Sample2, Sample3, join='outer')
adata_merged.obs_names_make_unique()

The adata_merged object seems OK and I can see the CITE-seq features in adata_merged.var, but I just want to make sure that it was done properly.

Thank you!

cane11 · January 14, 2025, 6:16pm

Yes, that looks correct. However, calling var_names_make_unique separately on each object can have unexpected side effects if your datasets have different var_names.

bioinf · January 21, 2025, 3:07pm

Thank you for replying. What side effects should I look into? The .var from the gene expression is the same, but I have different hashtags in some of the samples, so there are a few different names.

cane11 · January 24, 2025, 9:56pm

If the order of genes differs between datasets, you can end up with different names for the same gene and the same name for different genes (it appends -1 etc. to duplicated gene names).
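
To make that concrete, here is a minimal pure-Python sketch. The make_unique helper is a simplified stand-in for suffix-based de-duplication (first occurrence keeps its name, later duplicates get -1, -2, ...), not the actual anndata implementation, and the symbol and Ensembl IDs are made up for illustration.

```python
from collections import Counter


def make_unique(names, join="-"):
    """Simplified stand-in: keep the first occurrence, suffix later duplicates."""
    seen = Counter()
    unique = []
    for name in names:
        if seen[name] == 0:
            unique.append(name)
        else:
            unique.append(f"{name}{join}{seen[name]}")
        seen[name] += 1
    return unique


# Hypothetical toy data: the symbol "TBCE" is shared by two Ensembl IDs,
# and the two samples list those two features in opposite order.
sample1_ids = ["ENSG_A", "ENSG_B"]
sample1_symbols = ["TBCE", "TBCE"]
sample2_ids = ["ENSG_B", "ENSG_A"]
sample2_symbols = ["TBCE", "TBCE"]

# De-duplicate each sample separately, as in the original post.
name_to_id_1 = dict(zip(make_unique(sample1_symbols), sample1_ids))
name_to_id_2 = dict(zip(make_unique(sample2_symbols), sample2_ids))

print(name_to_id_1)  # {'TBCE': 'ENSG_A', 'TBCE-1': 'ENSG_B'}
print(name_to_id_2)  # {'TBCE': 'ENSG_B', 'TBCE-1': 'ENSG_A'}
```

After concatenation, the column named TBCE would mix counts from ENSG_A (sample 1) and ENSG_B (sample 2). If the feature order is identical in every file, the separate calls give the same names and this problem does not arise.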

bioinf · January 24, 2025, 10:20pm

Thank you for replying. Is there a better way to concatenate then?
I have not noticed -1 in the gene names. I do notice that it creates a gene ids-0, gene ids-1 etc in adata.var.
I see that on the barcodes it adds a -0, -1 etc at the end of the barcodes. Would that create an issue?

cane11 · January 29, 2025, 4:29pm

A more consistent approach is either to sum the counts of genes that share the same name, or to build unique names by string-concatenating the gene symbol and the Ensembl gene ID (ENSGID).
