Handling Non-Matching Categorical Labels in scArches

I’m using scArches to compare gene expression profiles between disease and normal samples. My query dataset contains stem-cell-derived cells from healthy individuals and patients, labeled WT, LSP2, and LSP3. The reference dataset contains samples derived from healthy brains, labeled Sample-1, Sample-2, and Sample-3.

The issue is that the categorical labels in my query and reference datasets do not match, and they represent fundamentally different conditions.

How should I handle and map these non-matching categorical labels to ensure compatibility for analysis in scArches?

Specifically:

  • Should I create a mapping that reflects the biological context of each sample?
  • Are there best practices for aligning categories when they represent different conditions or experimental setups?
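
To make the first option concrete, here is a rough sketch of the kind of mapping I have in mind. It is only an illustration: the `condition` field, the "healthy"/"patient" vocabulary, and the `add_condition` helper are placeholders I made up, and treating WT as healthy and LSP2/LSP3 as patient-derived is my own assumption. Plain lists stand in for the per-cell labels that would normally live in the AnnData `obs` table, and nothing here calls scArches itself.

```python
# Per-cell sample labels (placeholders standing in for an AnnData obs column).
query_labels = ["WT", "LSP2", "LSP3", "WT"]
ref_labels = ["Sample-1", "Sample-2", "Sample-3"]

# Assumed mapping from the original labels to one shared vocabulary.
# WT -> healthy and LSP2/LSP3 -> patient is an assumption, not a given.
label_to_condition = {
    "WT": "healthy",
    "LSP2": "patient",
    "LSP3": "patient",
    "Sample-1": "healthy",
    "Sample-2": "healthy",
    "Sample-3": "healthy",
}


def add_condition(labels, mapping):
    """Return (original_label, condition) pairs, failing on unmapped labels."""
    unmapped = sorted(set(labels) - set(mapping))
    if unmapped:
        raise ValueError(f"unmapped labels: {unmapped}")
    # Keep the original label next to the harmonized one so nothing is lost.
    return [(label, mapping[label]) for label in labels]


query_conditions = add_condition(query_labels, label_to_condition)
ref_conditions = add_condition(ref_labels, label_to_condition)

print(query_conditions)
print(ref_conditions)
```

With real data I would apply the same dictionary to the label column with pandas `Series.map` and store the result in a new column, leaving the original sample labels untouched. What I can't tell is whether scArches needs a harmonized column like this at all, or whether the query labels can simply be left as new categories.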