Hi everyone,
I would like to propose adding a new clustering method to the scverse ecosystem (potentially as a Scanpy plugin, or as a built-in Scanpy tool). The method is called TAU Community Detection, a hybrid of a genetic algorithm and Leiden designed for large, sparse graphs such as scRNA-seq kNN graphs.
TAU Community Detection is described and benchmarked in the following peer-reviewed work, published in 2023:
https://academic.oup.com/pnasnexus/article/2/6/pgad180/7187731
I have also released an open-source implementation as a pip package.
What TAU Does
TAU is an evolutionary algorithm that repeatedly optimizes modularity on large graphs while using Leiden refinement internally. Conceptually, it combines:
- Exploration (via genetic recombination, mutation, and immigration)
- Exploitation (via Leiden refinement on each offspring)
- Parallel evaluation to handle very large graphs
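To make the explore/exploit loop concrete, here is a minimal, dependency-free sketch. It is not the TAU implementation: a greedy local-moving step stands in for Leiden refinement, the crossover (partition intersection) and mutation operators are illustrative choices, evaluation is serial rather than parallel, and every function name is hypothetical.

```python
import random
from collections import defaultdict


def modularity(adj, labels, m):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    degree = defaultdict(int)    # total degree per community
    internal = defaultdict(int)  # internal edge endpoints (each edge counted twice)
    for u, nbrs in adj.items():
        degree[labels[u]] += len(nbrs)
        for v in nbrs:
            if labels[u] == labels[v]:
                internal[labels[u]] += 1
    return sum(internal[c] / (2 * m) - (degree[c] / (2 * m)) ** 2 for c in degree)


def refine(adj, labels, m, rng):
    """Exploitation: greedy local moving (an assumed stand-in for Leiden)."""
    labels = dict(labels)
    improved = True
    while improved:
        improved = False
        nodes = list(adj)
        rng.shuffle(nodes)
        for u in nodes:
            best_label, best_q = labels[u], modularity(adj, labels, m)
            for cand in sorted({labels[v] for v in adj[u]}):
                old, labels[u] = labels[u], cand
                q = modularity(adj, labels, m)
                labels[u] = old
                if q > best_q + 1e-12:
                    best_label, best_q = cand, q
            if best_label != labels[u]:
                labels[u] = best_label
                improved = True
    return labels


def crossover(a, b):
    """Recombination: nodes stay together only if both parents agree."""
    ids = {}
    return {u: ids.setdefault((a[u], b[u]), len(ids)) for u in a}


def mutate(labels, rng, rate=0.1):
    """Mutation: move a random fraction of nodes into fresh singleton communities."""
    labels = dict(labels)
    fresh = max(labels.values()) + 1
    for u in labels:
        if rng.random() < rate:
            labels[u] = fresh
            fresh += 1
    return labels


def evolve(adj, pop_size=6, generations=5, n_immigrants=1, seed=0):
    """Keep the best half, breed refined offspring, and add fresh immigrants."""
    rng = random.Random(seed)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    singletons = {u: i for i, u in enumerate(adj)}
    fitness = lambda p: modularity(adj, p, m)
    pop = [refine(adj, singletons, m, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) + n_immigrants < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(refine(adj, mutate(crossover(a, b), rng), m, rng))
        immigrants = [refine(adj, singletons, m, rng) for _ in range(n_immigrants)]
        pop = elite + children + immigrants
    return max(pop, key=fitness)


# Toy graph: two 4-cliques (nodes 0-3 and 4-7) joined by the single edge 3-4.
edges = [(i, j) for base in (0, 4) for i in range(base, base + 4)
         for j in range(i + 1, base + 4)] + [(3, 4)]
adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

best = evolve(adj)
print(best, modularity(adj, best, len(edges)))
```

Because the elite half is carried over unchanged, the best modularity in the population never decreases from one generation to the next; the immigrants are what keep the search from collapsing onto a single local optimum.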
In the benchmarks reported in the paper, it produces partitions with higher modularity than Leiden or Louvain alone.
Single-cell clustering often depends strongly on graph topology and resolution parameters. TAU can offer:
- More stable clusters, because it searches the global space of partitions
- Higher modularity and better separation on difficult datasets
- Compatibility with AnnData graph structures
- A drop-in alternative to sc.tl.leiden and sc.tl.louvain
Questions for the maintainers
Before developing a plugin or PR, I would love your notes on the following questions:
- Whether TAU should begin as a separate Scanpy plugin (following the scverse extension model), or whether you prefer evaluating a PR directly.
- API conventions, especially naming and integration with neighbors_key, use_rep, and the AnnData graph utilities.
- Performance standards you expect for inclusion.
- Documentation requirements to ensure alignment with the scverse ecosystem.
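To make the API question concrete, here is one hypothetical shape for the entry point. It is only a sketch: the names tau, cluster_fn, and resolve_connectivities_key are placeholders and not an existing API. The one thing it takes from Scanpy is the layout written by sc.pp.neighbors, where .uns[neighbors_key]["connectivities_key"] names the .obsp entry holding the graph. The clustering backend is passed in as a callable so the plumbing can be shown without any dependency.

```python
from types import SimpleNamespace


def resolve_connectivities_key(adata, neighbors_key=None):
    """Return the .obsp key of the kNN connectivities for a given neighbors_key."""
    neighbors_key = "neighbors" if neighbors_key is None else neighbors_key
    if neighbors_key not in adata.uns:
        raise KeyError(
            f"No neighbors found under .uns[{neighbors_key!r}]; run sc.pp.neighbors first."
        )
    return adata.uns[neighbors_key].get("connectivities_key", "connectivities")


def tau(adata, *, cluster_fn, random_state=0, neighbors_key=None, key_added="tau"):
    """Hypothetical wrapper: cluster the stored graph and write labels to .obs.

    cluster_fn is an assumed callable (graph, random_state=...) -> sequence of
    integer labels, one per observation.
    """
    graph = adata.obsp[resolve_connectivities_key(adata, neighbors_key)]
    labels = cluster_fn(graph, random_state=random_state)
    # String labels, in the spirit of sc.tl.leiden (which stores a categorical).
    adata.obs[key_added] = [str(label) for label in labels]
    adata.uns[key_added] = {"params": {"random_state": random_state}}


# Stand-in for an AnnData object, only to exercise the plumbing above.
fake = SimpleNamespace(
    obs={},
    obsp={"knn15_connectivities": [[0, 1, 0], [1, 0, 1], [0, 1, 0]]},
    uns={"knn15": {"connectivities_key": "knn15_connectivities"}},
)
tau(fake, cluster_fn=lambda graph, random_state: [0] * len(graph), neighbors_key="knn15")
print(fake.obs["tau"])
```

Whether the labels should land under key_added="tau", and whether the function should also accept an obsp argument the way sc.tl.leiden does, are exactly the conventions I would like guidance on.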
Thank you for your time and for maintaining this fantastic ecosystem. I’d be happy to join this community.
Looking forward to feedback!