Impact of tmin parameter (decoupler.mt.ulm)

Hello,

I am wondering about the impact of modifying the tmin parameter in decoupler.mt.ulm when working with gene regulatory networks.

As a reminder, tmin is the minimum number of targets per source. I guess that by increasing tmin we have fewer TFs in the analysis, meaning fewer multiple comparisons and thus a lower false positive rate (= TFs reported as enriched when they are not). Conversely, by decreasing tmin we also include TFs with just a few targets, which allows new discoveries (better for exploration) at the cost of more noise and a higher false positive rate. Is this correct?
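
To make the question concrete, here is a minimal sketch of how I picture the filter. It does not call decoupler at all: the toy network, the TF and gene names, and the helper `sources_kept` are all made up for illustration, and I am assuming a source is kept when its number of targets is at least tmin.

```python
from collections import Counter

# Hypothetical toy network: (source TF, target gene) pairs.
toy_net = [
    ("TF_A", "g1"), ("TF_A", "g2"), ("TF_A", "g3"),
    ("TF_A", "g4"), ("TF_A", "g5"), ("TF_A", "g6"),
    ("TF_B", "g1"), ("TF_B", "g7"), ("TF_B", "g8"),
    ("TF_B", "g9"), ("TF_B", "g10"),
    ("TF_C", "g2"), ("TF_C", "g3"),
    ("TF_D", "g4"),
]


def sources_kept(net, tmin):
    """Return the sources that have at least `tmin` targets in `net`."""
    counts = Counter(source for source, _ in net)
    return sorted(s for s, n in counts.items() if n >= tmin)


# Fewer sources survive as tmin grows, so fewer tests are performed.
kept_by_tmin = {tmin: sources_kept(toy_net, tmin) for tmin in (1, 2, 5, 6)}
for tmin, kept in kept_by_tmin.items():
    print(f"tmin={tmin}: {len(kept)} sources tested -> {kept}")
```

In this toy case, going from tmin=1 to tmin=5 drops the two TFs with one or two targets, which is the trade-off I am asking about.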

Thank you for any insight

Hi @npont,

This is absolutely correct! But basically tmin is there to remove extreme outliers, for example a TF explained by only two genes, or just one. Remember that this analysis is, at its core, just a competitive set enrichment test: we are comparing whether a set of features is over- or under-expressed compared to a background. Imagine the simplest statistic, for example the mean. If the set comprises only two features, can we really trust this statistic? Would you trust a mean of two values, or of just one? That is why the minimum is set to 5, but the value is completely arbitrary. Hope this is helpful!
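
The point about the mean can be illustrated with a small simulation. This is only a sketch under simple assumptions: feature values are drawn from a standard normal, the statistic is the plain mean rather than the actual ulm statistic, and the helper name `spread_of_mean` is made up for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible


def spread_of_mean(set_size, n_repeats=5000):
    """Standard deviation of the mean of `set_size` random values."""
    means = [
        statistics.fmean([random.gauss(0.0, 1.0) for _ in range(set_size)])
        for _ in range(n_repeats)
    ]
    return statistics.stdev(means)


# The mean of a tiny set fluctuates far more than the mean of a larger one;
# in theory the spread shrinks like 1 / sqrt(set size).
spread = {n: spread_of_mean(n) for n in (1, 2, 5, 20)}
for n, s in spread.items():
    print(f"set size {n:>2}: spread of the mean ~ {s:.2f} (theory: {n ** -0.5:.2f})")
```

With one or two features the mean is almost as noisy as a single value, so a source with so few targets can look strongly enriched by chance alone. The improvement is gradual, with no sharp cutoff at 5, which is why the default is a convention rather than a derived threshold.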