Model which can work on binary data

Gbrown · February 23, 2023, 4:25pm

Hey,

I have binary data (0/1) rather than continuous expression data.
Do you think it's possible to use the scvi-tools models on my data?

Thank you.

Valentine_Svensson · February 27, 2023, 8:03pm

Hi,

Kind of, but if you try it, I would highly recommend doing some statistical investigation. The scvi-tools models are built for count data (they don’t actually model continuous expression data).

Briefly, the way count distributions ‘work’ is that they arise from adding together binary outcomes (e.g., if you observe 7 ‘successes’ and 3 ‘failures’, you have a count of 7, and in general you also keep track of the fact that you looked at 10 cases in total). Some variations add further sources of variation on top of the counting procedure: the default ZINB distribution in scvi-tools for scRNA-seq models the counting procedure, plus differences in efficiency per observation, plus counts missing at random.
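To make the "counts are sums of binary outcomes" idea concrete, here is a minimal sketch in plain Python. The function name count_from_trials and the numbers (10 trials, success probability 0.7) are only illustrative assumptions, not anything from scvi-tools.

```python
import random


def count_from_trials(n_trials, p_success, rng):
    """Draw n_trials binary outcomes and return (count, outcomes).

    The count is simply the number of 1s among the outcomes.
    """
    outcomes = [1 if rng.random() < p_success else 0 for _ in range(n_trials)]
    return sum(outcomes), outcomes


rng = random.Random(0)  # fixed seed so the draw is reproducible
count, outcomes = count_from_trials(n_trials=10, p_success=0.7, rng=rng)

# Each entry of `outcomes` is a single binary observation;
# `count` is what a count model would see after accumulating them.
print(outcomes, "->", count, "successes out of", len(outcomes))
```

Binary data corresponds to keeping the individual entries of outcomes instead of their sum.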

With this in mind, a count of 1 is still a count, one that differs from a count of 0. You would model the binary case with a Bernoulli distribution. The first step in accumulating binary data is to move to the binomial distribution, where you say “I have x positive cases out of n trials”. If you set n = 1, you are back at the Bernoulli case.
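As a quick check of the n = 1 statement, the sketch below compares the binomial probability mass function with the Bernoulli one. The helper names and the value p = 0.3 are illustrative assumptions.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for x positive cases out of n trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def bernoulli_pmf(x, p):
    """P(X = x) for a single binary outcome x in {0, 1}."""
    return p if x == 1 else 1 - p


p = 0.3
# With n = 1 the binomial assigns the same probabilities as the Bernoulli.
differences = [abs(binomial_pmf(x, 1, p) - bernoulli_pmf(x, p)) for x in (0, 1)]
print(differences)
```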

What I would do is try the models in there as a pilot, with the likelihood parameter (gene_likelihood in scvi.model.SCVI) set to ‘poisson’, and see if I get anything.
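A pilot along those lines could look roughly like the sketch below. It assumes the 0/1 matrix is stored in adata.X of an AnnData object; the wrapper name run_poisson_pilot and the max_epochs default are hypothetical choices for illustration, and the exact arguments may differ between scvi-tools versions.

```python
def run_poisson_pilot(adata, max_epochs=50):
    """Fit scVI with a Poisson likelihood on a 0/1 matrix stored in adata.X.

    Returns the trained model and the latent representation of the cells.
    """
    # Imported inside the function so this sketch can be loaded
    # even where scvi-tools is not installed.
    import scvi

    # Register the data; the binary values are treated as counts of 0 or 1.
    scvi.model.SCVI.setup_anndata(adata)

    # Use the Poisson likelihood instead of the default ZINB.
    model = scvi.model.SCVI(adata, gene_likelihood="poisson")
    model.train(max_epochs=max_epochs)

    return model, model.get_latent_representation()
```

Whether the result counts as "getting anything" is a judgment call; one option is to check whether the latent representation separates groups you already know about in the data.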

If results are promising, I would use the scvi-tools skeleton (the scverse/scvi-tools-skeleton repository on GitHub, a template for creating novel models with scvi-tools) and basically copy in the components of the model you are using, but in the loss function in _mymodule.py (line 177) replace ZeroInflatedNegativeBinomial with torch.distributions.Bernoulli (see the torch.distributions page of the PyTorch 1.13 documentation).
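The Bernoulli version of the reconstruction term could be sketched as below. This is not the skeleton's actual code: the function name bernoulli_reconstruction_loss and the toy tensors are hypothetical, and it assumes the decoder is changed to output one logit per feature instead of the ZINB parameters. The file layout and line number mentioned above may also have changed since this was written.

```python
import math

import torch
from torch.distributions import Bernoulli


def bernoulli_reconstruction_loss(x, logits):
    """Negative Bernoulli log-likelihood, summed over features.

    x:      float tensor of 0/1 values, shape (cells, features)
    logits: decoder output on the log-odds scale, same shape as x
    Returns one loss value per cell.
    """
    return -Bernoulli(logits=logits).log_prob(x).sum(dim=-1)


# Toy example: 2 cells, 3 binary features, and logits of 0 (probability 0.5).
x = torch.tensor([[1.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
logits = torch.zeros_like(x)
loss = bernoulli_reconstruction_loss(x, logits)

# With p = 0.5 every feature contributes log(2), so each cell's loss is 3 * log(2).
expected = torch.full((2,), 3 * math.log(2))
print(loss, expected)
```

Parameterizing by logits rather than probabilities avoids having to clamp the decoder output to the (0, 1) interval.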

Hope this helps!
/Valentine
