Hi!
I have some (rather naive) questions about the new MrVI tool. Given that it predicts counterfactual cell states for each cell across samples, is there a recommended minimum number of cells per sample for a sample to be included in this analysis? Intuitively, I would think that if a sample has few cells, the algorithm will have a hard time predicting the effect of that sample on any cell.
Similarly, if two samples belong to different tissues (cross-organ integration), would the algorithm essentially be trying to predict what a cell from tissue A would look like if it belonged to tissue B, which might not make much biological sense?
Indeed, the algorithm relies on representative cellular states being present and will not produce reasonable predictions otherwise. We performed a limited cross-tissue analysis in the manuscript (ileum and colon in inflammatory bowel disease) and a more extensive cross-tissue analysis in https://www.biorxiv.org/content/10.1101/2024.01.03.573877v2.full. This only makes sense for cell types that are present in both tissues. To reduce the risk of unreliable counterfactual predictions, we provide an option to exclude samples that have no similar cells present, using the procedure described in the manuscript.
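As a rough pre-check before training, something like the sketch below can show which samples are very small and which cell types are actually shared between tissues. This is plain pandas on a toy table standing in for `adata.obs`, not MrVI's own filtering procedure; the column names (`sample`, `tissue`, `cell_type`) and the `MIN_CELLS` threshold are assumptions made up for illustration, not recommended values.

```python
import pandas as pd

# Toy stand-in for adata.obs; the column names are hypothetical.
obs = pd.DataFrame({
    "sample": ["s1"] * 5 + ["s2"] * 2 + ["s3"] * 4,
    "tissue": ["ileum"] * 5 + ["ileum"] * 2 + ["colon"] * 4,
    "cell_type": ["T", "T", "B", "B", "enterocyte",
                  "T", "B",
                  "T", "T", "B", "goblet"],
})

# Hypothetical cutoff chosen only for this toy example.
MIN_CELLS = 3

# Count cells per sample and list the samples below the cutoff.
cells_per_sample = obs["sample"].value_counts()
small_samples = sorted(cells_per_sample[cells_per_sample < MIN_CELLS].index)

# Collect the annotated cell types seen in each tissue, then intersect them
# to find the types for which a cross-tissue comparison is even possible.
types_per_tissue = {
    tissue: set(group["cell_type"]) for tissue, group in obs.groupby("tissue")
}
shared_types = sorted(set.intersection(*types_per_tissue.values()))

print("samples below cutoff:", small_samples)
print("cell types shared across tissues:", shared_types)
```

Samples flagged here are not necessarily unusable; the check only points to where counterfactual predictions deserve a closer look.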