Visium normalization best practices

Are there any recommendations on normalizing spatial transcriptomics data before visualization? In the spatial scanpy tutorial, the gene expression is normalized like scRNA-seq data using normalize_total + log1p. In the squidpy visium tutorial, on the other hand, raw counts are plotted.

Personally I’m not convinced that normalize_total makes sense for spatial data, as

  1. I’d assume there is less technical variability between spots than between droplets.
  2. There are biological reasons for different spatial regions having different mRNA content (see also the example below).

But there should probably still be some normalization when comparing between samples. I’m wondering if something like scran makes sense here? As far as I understand, scran takes the different mRNA content of cells into account when computing scaling factors. Alternatively, maybe just normalizing at the sample level (e.g. scaling each sample to a total of 10k * n_spots_under_tissue) is appropriate?
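
To make the sample-level idea concrete, here is a minimal sketch in plain NumPy. The function name `normalize_per_sample`, the `target_per_spot` parameter and the toy counts are made up for illustration; the only assumption is that `counts` holds the raw counts of the spots under tissue for one sample.

```python
import numpy as np

def normalize_per_sample(counts, target_per_spot=1e4):
    """Scale one sample by a single global factor (hypothetical helper).

    counts: (n_spots, n_genes) raw counts for spots under tissue.
    After scaling, the sample total is target_per_spot * n_spots,
    but the relative differences between spots are left untouched.
    """
    counts = np.asarray(counts, dtype=float)
    n_spots = counts.shape[0]
    factor = target_per_spot * n_spots / counts.sum()
    return counts * factor

# Toy sample: one dense "tumor" spot and one sparse "stroma" spot
sample = np.array([[900, 100],
                   [90, 10]])
scaled = normalize_per_sample(sample)
totals = scaled.sum(axis=1)
```

Because every spot is multiplied by the same factor, the 10x difference in total counts between the two toy spots survives, while samples sequenced at different depths end up on a comparable scale.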


normalize_total in a tumor sample

In my experience, normalize_total seems problematic at least for the tumor slides I have, because different regions of the slide have vastly different mRNA content. For instance, the tumor region has a much higher count density than the stromal or immune regions.
Fig1: log1p total counts


Fig2: spatial niches

When plotting the normalized gene expression, the normalized values appear to be much higher (although sparser) in the immune region, which is misleading:
Fig3: KRAS expression, log-normalized

Here is the same plot with just log1p transformed counts:
Fig4: KRAS expression, log1p-transformed
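
The effect in the two plots can be reproduced with a toy calculation. The numbers below are invented, not taken from the slide, and `normalize_total_log1p` is a hand-rolled stand-in that scales each spot to a fixed `target_sum` of 10k (scanpy's `normalize_total` defaults to the median total count when `target_sum` is not set).

```python
import numpy as np

def normalize_total_log1p(counts, target_sum=1e4):
    """Per-spot depth scaling followed by log1p (illustrative stand-in)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / totals * target_sum)

# Columns: [gene of interest, all other genes combined]
# Row 0: "tumor" spot, 20,000 counts in total, 10 for the gene
# Row 1: "immune" spot, 1,000 counts in total, 2 for the gene
counts = np.array([[10, 19990],
                   [2, 998]])

log_norm = normalize_total_log1p(counts)[:, 0]  # approx. [1.79, 3.04]
log_raw = np.log1p(counts[:, 0])                # approx. [2.40, 1.10]
```

The tumor spot has five times more transcripts of the gene, yet after per-spot scaling the immune spot shows the higher value, simply because its total is 20x lower. With plain log1p the ordering matches the raw counts.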


Hi @grst ,

I was wondering the same thing.
While most publications use normalize_total, it also seems a bit strange to me.
I saw Seurat’s sctransform and stLearn’s SME recommended here, as they do not assume that every spot / cell should contain the same number of counts.

Not sure how to best evaluate what works well and what doesn’t though. :man_shrugging:

The normalization from Cell2location, which uses detection efficiency, behaves well in my hands. For pure display purposes, and if you don’t want to train a model, another round of count normalization after the log transform also helps, as proposed by Lior Pachter (I don’t remember where).
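
A rough sketch of the second suggestion, assuming it means: scale each spot to a common total, apply log1p, then scale each spot again so the log-transformed rows share the same total. The function names are hypothetical, the mean total is used as the target as one possible choice, and every spot is assumed to have at least one count.

```python
import numpy as np

def proportional_fit(x, target=None):
    """Scale each row (spot) so that all rows sum to the same value."""
    x = np.asarray(x, dtype=float)
    totals = x.sum(axis=1, keepdims=True)  # assumes no all-zero rows
    if target is None:
        target = totals.mean()  # assumption: mean row total as target
    return x / totals * target

def pf_log1p_pf(counts):
    """Count normalization, log1p, then count normalization again."""
    return proportional_fit(np.log1p(proportional_fit(counts)))

# Toy counts: three spots with very different depths
counts = np.array([[900, 100, 50],
                   [90, 10, 0],
                   [5, 5, 1]])
display_values = pf_log1p_pf(counts)
```

Note that this still equalizes totals per spot, so the concern raised earlier in the thread about regions with genuinely different mRNA content applies here as well; it is meant for display only.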
