scPoli usage guidance


If any of the developers of scPoli are here, I would be grateful for some guidance on the best approach for an analysis.

I have read the paper and looked through the documentation, but the documentation is a little sparse on more complex analyses.

If I am integrating unlabelled datasets spanning multiple diseases and healthy samples, which of the following approaches is most appropriate to maximise the chances of detecting disease-associated variation/cell states:

  1. Build a reference from healthy donors, annotate it, and map the disease samples to that reference, as the lung atlas papers suggest is a valid method.

  2. Pre-annotate the individual datasets and integrate everything together, as suggested by Figure 6 in the scPoli paper.

It is not clear to me which of these is the more appropriate approach. The goal is to use the integrated model as a reference to map new data to, but also to do a meta-analysis of the driver signatures of disease states.

Many thanks for any help.