I am currently trying to combine data from two studies with multiple donors per study, where we are integrating on donors. One of the studies has labels, the other doesn’t. This is a clear-cut case for scANVI and it works great. However, while doing this I realised it would be nice to make the integrated set available for others to build on with online updating / architectural surgery.
In the scANVI tutorial for building reference maps, though, all cells that go into the reference map are labelled, while the query can be either labelled or unlabelled. So the question becomes:
Does it make sense to create a reference atlas from partially labelled data, following the scANVI guide for online updating? This would create the reference model and label the unlabelled study in one go. Are there any potential issues with this approach?
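For concreteness, this is roughly what I mean by the one-go approach. It is only a sketch under my own assumptions: the column names `study`, `donor` and `cell_type`, the placeholder `"Unknown"`, and the helpers `mask_unlabelled_study` and `train_partially_labelled_reference` are hypothetical, `adata.X` is assumed to hold raw counts, and the architecture arguments are the ones I took from the reference-mapping tutorial rather than anything validated for partially labelled references.

```python
def mask_unlabelled_study(labels, studies, unlabelled_study, unlabeled_category="Unknown"):
    """Return a copy of `labels` where every cell from `unlabelled_study`
    is set to the placeholder category that scANVI treats as unlabelled."""
    return [
        unlabeled_category if study == unlabelled_study else label
        for label, study in zip(labels, studies)
    ]


def train_partially_labelled_reference(adata, unlabelled_study, save_dir,
                                       unlabeled_category="Unknown"):
    """Train scVI, then scANVI, on a reference where one study has no labels.

    Assumes adata.obs has 'study', 'donor' and 'cell_type' columns
    (hypothetical names) and that adata.X contains raw counts.
    """
    import scvi  # imported lazily so the helper above works without scvi-tools

    # Mark every cell of the unlabelled study with the placeholder category.
    adata.obs["cell_type"] = mask_unlabelled_study(
        adata.obs["cell_type"].astype(str).tolist(),
        adata.obs["study"].tolist(),
        unlabelled_study,
        unlabeled_category,
    )

    # Integrate on donors, with the settings the tutorial uses for surgery.
    scvi.model.SCVI.setup_anndata(adata, batch_key="donor")
    vae = scvi.model.SCVI(
        adata,
        use_layer_norm="both",
        use_batch_norm="none",
        encode_covariates=True,
        dropout_rate=0.2,
        n_layers=2,
    )
    vae.train()

    # Semi-supervised step: cells tagged with the placeholder are unlabelled.
    scanvi = scvi.model.SCANVI.from_scvi_model(
        vae, unlabeled_category=unlabeled_category, labels_key="cell_type"
    )
    scanvi.train(max_epochs=20, n_samples_per_label=100)

    # Label the unlabelled study and save the model for later query mapping.
    adata.obs["predicted_cell_type"] = scanvi.predict()
    scanvi.save(save_dir, overwrite=True)
    return scanvi
```

Someone building on this would then, I assume, load the saved directory with `scvi.model.SCANVI.load_query_data(query_adata, save_dir)` as in the tutorial.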
If not, I guess an alternative is to do label transfer first using scANVI and then build the reference atlas from the now fully labelled (partly by transfer) datasets. However, this seems a bit circular to me.
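To make the two-step alternative explicit, here is a minimal sketch of the step in between: original labels are kept where they exist, and scANVI predictions are used only for cells that were unlabelled. The helper name `fill_transferred_labels` and the placeholder `"Unknown"` are my own assumptions, not part of scvi-tools.

```python
def fill_transferred_labels(labels, predictions, unlabeled_category="Unknown"):
    """Keep curated labels; use the transferred prediction only where a cell
    carries the unlabelled placeholder."""
    return [
        pred if label == unlabeled_category else label
        for label, pred in zip(labels, predictions)
    ]


# Toy example: the third and fourth cells come from the unlabelled study.
original = ["T cell", "B cell", "Unknown", "Unknown"]
predicted = ["T cell", "T cell", "NK cell", "B cell"]
merged = fill_transferred_labels(original, predicted)
print(merged)
```

The merged column would then be used as `labels_key` for a second scANVI model trained as the reference, which is where the circularity worry comes from: that model is trained partly on labels an earlier scANVI model produced.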
Thanks in advance