Interpreting decoupler results

I am using decoupler’s univariate linear model (ULM) approach to infer transcription factor activity based on CollecTRI GRNs. When I run decoupler.rankby_group() with the Wilcoxon test to obtain a list of the top inferred transcription factors per cluster vs. the rest of the clusters, each transcription factor is ranked by score.

  1. Can I infer that a higher decoupler score for one TF means more inferred activity for that TF compared to another (assuming both TFs are significantly active in that cluster compared to the rest)? Or is that not necessarily true?

  2. If I want to compare the inferred activity of one active transcription factor to another active transcription factor to test for statistical significance within a given cluster (in order to find whether one transcription factor is more active than another TF in that cluster), what would be the best way to do that?

Thank you for a wonderful tool. I really enjoy using it.

Hi @jpagolia, thanks for the kind words! Regarding your questions:

  1. Can I infer that a higher decoupler score for one TF means more inferred activity for that TF compared to another (assuming both TFs are significantly active in that cluster compared to the rest)? Or is that not necessarily true?

Yes, but with a caveat. The ULM score is a t-statistic from a univariate linear regression of a cell’s expression profile against each TF’s regulon weights. Because it’s a t-statistic, it reflects both the effect size (how strongly the regulon targets are coordinately regulated) and the uncertainty of that estimate (which depends on how many targets the TF has and how variable they are). So in general, a higher score does reflect stronger inferred activity. The caveat is that two TFs with different regulon sizes can differ in score partly because of that uncertainty term, not only because of effect size.
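To make the "t-statistic" point concrete, here is a minimal sketch of the idea for a single cell and a single TF, using plain NumPy/SciPy instead of decoupler itself. The regulon, weights, and expression values are simulated and purely illustrative; the only assumption carried over from above is that the score is the t-statistic of the slope when expression is regressed on the regulon weights.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical setup: 200 genes, of which the first 30 are targets of one TF.
n_genes = 200
weights = np.zeros(n_genes)
weights[:30] = rng.choice([-1.0, 1.0], size=30)  # signed regulon weights, 0 for non-targets

# Simulated expression for one cell: targets shift in the direction of their weight.
expr = 1.0 * weights + rng.normal(size=n_genes)

# Univariate linear regression of expression on the regulon weights.
res = stats.linregress(weights, expr)

# The t-statistic of the slope: effect size divided by its standard error.
t_stat = res.slope / res.stderr

print(f"slope (effect size): {res.slope:.3f}")
print(f"standard error:      {res.stderr:.3f}")
print(f"t-statistic:         {t_stat:.3f}")
```

The same slope with a smaller standard error gives a larger t-statistic, which is why the score should be read as effect size scaled by precision and not as effect size alone.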

  2. If I want to compare the inferred activity of one active transcription factor to another active transcription factor to test for statistical significance within a given cluster (in order to find whether one transcription factor is more active than another TF in that cluster), what would be the best way to do that?

You could run a paired comparison across the cells in the cluster: extract the per-cell ULM scores for those cells, then run a paired Wilcoxon signed-rank test (or a paired t-test) comparing the scores of TF_A and TF_B across them. This asks the question: across cells in this cluster, is TF_A’s inferred activity systematically higher than TF_B’s?
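A rough sketch of that paired test with SciPy is below. The two arrays are simulated stand-ins for the per-cell ULM scores of TF_A and TF_B in one cluster; in practice you would pull them from wherever your decoupler run stored the scores and subset to the cluster's cells first. The variable names and the one-sided alternative are choices made for illustration only.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)

# Simulated per-cell scores for the cells of one cluster (illustrative values only).
# Each position is the same cell in both arrays, which is what makes the test paired.
n_cells = 150
tf_a_scores = rng.normal(loc=3.0, scale=1.0, size=n_cells)
tf_b_scores = tf_a_scores - 0.5 + rng.normal(scale=0.5, size=n_cells)

# Paired Wilcoxon signed-rank test on the per-cell differences.
# alternative="greater" tests whether TF_A scores tend to exceed TF_B scores.
result = wilcoxon(tf_a_scores, tf_b_scores, alternative="greater")

median_diff = np.median(tf_a_scores - tf_b_scores)
print(f"median paired difference (A - B): {median_diff:.3f}")
print(f"Wilcoxon statistic: {result.statistic:.1f}, p-value: {result.pvalue:.3g}")
```

Reporting the median paired difference next to the p-value helps here, because with many cells even a very small difference can come out as significant.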

Hope that helps!