Diffmap DPT question

Hello,
I am following the Paul15 data tutorial here to create a diffusion map and pseudotime for my data:

And this tutorial:
https://scanpy-tutorials.readthedocs.io/en/latest/paga-paul15.html

However, my diffusion maps look very strange: the dots on my diffmap are very spread out for some reason. Is there something I am not doing right?

My code is below:

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd

import scanpy as sc

sc.set_figure_params(figsize=(6,6))
sc.settings.figdir = "/home/figures"

adata = sc.read("Lymphocytes-CD8.h5ad")

sc.pp.recipe_zheng17(adata)
sc.pp.neighbors(adata, n_neighbors=20, use_rep='X', method='gauss')
sc.tl.diffmap(adata)
sc.pl.diffmap(adata, color='int_labels', save=".pdf")

adata.uns['iroot'] = np.flatnonzero(adata.obs['int_labels'] == 'Tissue Resident Memory')[0]

sc.tl.dpt(adata, n_branchings=1, n_dcs=10)
sc.pl.diffmap(adata, color=['dpt_pseudotime', 'dpt_groups', 'my_labels'])

Thanks for your help - I am new to scanpy so appreciate any insight!
s2hui

Hi,

I’m also new to scanpy and ran through the exact same tutorial with my data. My diffmap looks very similar to yours, @s2hui1: the cells seem to be lined up along the edges of the diffmap.

Have you been able to figure out why the diffmap looks that way, and whether it can be interpreted in a certain way?

Thanks!

I’m not sure whether it is related, but I discovered that some values in my DPT result were Inf. When I removed these cells and plotted the diffmap again, it looked much more “reasonable”.

I don’t know how to interpret the Inf values, or whether they can simply be ignored.
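
For anyone who wants to check this on their own data, here is a minimal sketch of counting and masking the non-finite pseudotime values. It assumes the pseudotime is stored in adata.obs['dpt_pseudotime'] (where sc.tl.dpt writes it); the helper name finite_pseudotime_mask and the toy array are made up for illustration. One possible explanation, not verified for this dataset, is that cells which cannot be reached from the root cell on the neighbor graph end up with infinite pseudotime.

```python
import numpy as np


def finite_pseudotime_mask(pseudotime):
    """Return a boolean mask that is True where the pseudotime is finite.

    Hypothetical helper, not part of scanpy.
    """
    values = np.asarray(pseudotime, dtype=float)
    return np.isfinite(values)


# Toy stand-in for adata.obs['dpt_pseudotime'] (made-up values)
toy_pseudotime = np.array([0.0, 0.2, np.inf, 0.7, np.inf, 1.0])

mask = finite_pseudotime_mask(toy_pseudotime)
n_non_finite = int((~mask).sum())
print(f"{n_non_finite} of {mask.size} cells have non-finite pseudotime")

# With a real AnnData object (assumed to be called `adata`), the same mask
# can be used to plot only the cells with finite pseudotime:
# mask = finite_pseudotime_mask(adata.obs['dpt_pseudotime'])
# sc.pl.diffmap(adata[mask], color='dpt_pseudotime')
```

Masking only hides the affected cells in the plot. Whether they can be ignored depends on why they are disconnected, so it is probably worth checking which labels they carry first.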

Hi @s2hui1

I think this happens because you’re computing the diffusion map (DM) directly on the expression data (which is correct), and not on top of PCA (which is a malpractice that has nevertheless been widespread throughout manuscripts due to poor peer-review).

What these results show you is that the learned components are not linearly correlated. I suggest you try coloring your UMAP plots with the values of the diffusion components (e.g. color by DC1, DC2 etc) to interpret these results. I also suggest you perform UMAP using the data matrix (sc.pp.neighbors with use_rep='X' prior to sc.tl.umap) to avoid linear bias due to PCA (again, a malpractice that sadly remains widespread).
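
The coloring suggestion above could look roughly like the sketch below. The diffusion components live in adata.obsm['X_diffmap'], so they have to be copied into adata.obs before sc.pl.umap can color by them. The helper name diffmap_columns and the column names DC1, DC2, ... are made up for this example, and the toy matrix only stands in for a real X_diffmap. As far as I know, column 0 of X_diffmap in scanpy is the non-informative steady-state component, so the sketch skips it by default; treat that as an assumption and check it for your scanpy version.

```python
import numpy as np


def diffmap_columns(x_diffmap, n_components=3, skip_first=True):
    """Split a diffusion-map matrix into named 1-D columns (DC1, DC2, ...).

    Hypothetical helper, not part of scanpy. With skip_first=True the
    first column is dropped, on the assumption that it is the trivial
    steady-state component.
    """
    x = np.asarray(x_diffmap)
    offset = 1 if skip_first else 0
    n = min(n_components, x.shape[1] - offset)
    return {f"DC{i + 1}": x[:, i + offset] for i in range(n)}


# Toy stand-in for adata.obsm['X_diffmap'] (5 cells, 4 components)
rng = np.random.default_rng(0)
toy_diffmap = rng.normal(size=(5, 4))

columns = diffmap_columns(toy_diffmap, n_components=2)
print(list(columns))

# With a real AnnData object (assumed to be called `adata`):
# for name, values in diffmap_columns(adata.obsm['X_diffmap']).items():
#     adata.obs[name] = values
# sc.pp.neighbors(adata, use_rep='X')   # neighbors on the data matrix, no PCA
# sc.tl.umap(adata)
# sc.pl.umap(adata, color=['DC1', 'DC2', 'DC3'])
```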

To cut a long story short: I don’t think there’s anything particularly wrong with your results. Other eigenfunctions of Laplace-type operators yield practically similar results (orthogonal components).