Dear SCVI team,
Many thanks again for the brilliant suite of tools.
May I ask how totalVI handles missing ADT data?
The scenario would be as follows:
- Gene expression (GEX) and CITE-seq counts were generated separately with Cell Ranger.
- The Cell Ranger filtered matrices for CITE-seq may miss many droplets that are of good quality in the gene expression data.
- To preserve the good-quality GEX cells, the raw CITE-seq matrices were merged in instead, so the data may include droplets where the CITE-seq signal is poor or mostly ambient.
The questions are:
1) In this context, would totalVI be able to model that the CITE-seq counts in cells with poor-quality CITE-seq (but good-quality GEX) are all background?
2) Would the better approach be to zero all the CITE-seq counts for cells that did not pass ADT barcode filtering?
2a) If we zero the ADT counts for the subset of cells that did not pass filtering, will totalVI treat ADT as missing for these cells, or do we need to assign them to a separate batch?
- This mainly refers to the line "when it’s all zeros, totalVI identifies that the protein data is missing in this “batch”" from this link (Reference mapping with scvi-tools - scvi-tools)
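To make the zeroing and separate-batch options concrete, here is a minimal sketch of what I mean. It uses NumPy only and toy data. The barcodes, the `adt_pass_barcodes` set and the batch labels `adt_ok` / `adt_missing` are made-up names for illustration, and passing the result to totalVI is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the raw ADT counts of 6 GEX-filtered cells x 3 proteins.
barcodes = np.array(["AAAC", "AAAG", "AACA", "AACC", "AAGA", "AAGG"])
adt_raw = rng.poisson(lam=20, size=(6, 3))

# Hypothetical set of barcodes that also passed the ADT filtering.
adt_pass_barcodes = ["AAAC", "AACA", "AACC"]
adt_pass = np.isin(barcodes, adt_pass_barcodes)

# Zeroing option: set every ADT count to zero for cells that failed ADT filtering.
adt_zeroed = adt_raw.copy()
adt_zeroed[~adt_pass] = 0

# Separate-batch option: additionally label those cells as their own batch,
# so that the protein data is all zeros within that batch.
batch = np.where(adt_pass, "adt_ok", "adt_missing")

print(adt_zeroed)
print(batch)
```

The question is then whether the zeroed rows alone are enough, or whether the `batch` labels are also required for the protein data to be treated as missing.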
Many thanks in advance for any advice.