How to add metadata (organ, sample number, etc.) to multiple single-cell RNA files from different experiments

I am handling a number of Cell Ranger output files from different experiments and integrating them together.

As I understand it, integration here is just reading all the files in a for loop and concatenating them. The issue I am having is adding the experiment metadata, such as the type of organ. Is there a way to add it accurately once the data has been read using sc.read_10x_mtx?

Hey @Echoo-ranger,

Could you add this information as columns of the obs dataframe before concatenation? If there is only one condition per Cell Ranger output, you could also add this information during concatenation with the label argument. In that case, the code should probably look something like:

import scanpy as sc
import anndata as ad

# Read in each experiment
experiment_dirs = ["organ1", "organ2", ...]
experiment_adatas = {
    expr_name: sc.read_10x_h5(f"path/to/{expr_name}/outs/filtered_feature_bc_matrix.h5")
    for expr_name in experiment_dirs
}

# Concatenate them, adding an indicator column in `.obs` for experiments
combined = ad.concat(experiment_adatas, label="experiment", merge="unique")

I think it is likely you will end up having to perform some kind of batch integration, or at least batch-specific normalization to account for sequencing depth.