scANVI fails and returns NaNs after few epochs

temitopeleke · November 11, 2022, 6:44pm

Hello,

Backstory:

I successfully installed scvi-tools on Apple Silicon. Training worked on a small dataset (20k cells, 3k genes), but it became extremely slow when the dataset size was 20k x 17k. So I decided to use accelerated PyTorch training on MPS. However, the aten::remainder.Tensor_out operation is not yet implemented for MPS, so I get the following error:

NotImplementedError: Could not run 'aten::index.Tensor' with arguments from the 'MPS' backend.

This was described in this GitHub issue, which suggested setting the PYTORCH_ENABLE_MPS_FALLBACK=1 environment variable so that the operation falls back to the CPU instead.
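For readers who want to try the same workaround from inside a script or notebook, here is a minimal sketch. It assumes the variable has to be in place before PyTorch is imported, which is the commonly recommended order; `enable_mps_fallback` is a hypothetical helper name, not part of PyTorch or scvi-tools.

```python
import os


def enable_mps_fallback() -> str:
    """Ask PyTorch to run ops that MPS does not support on the CPU instead."""
    # Assumption: this is set before `import torch` runs anywhere in the process.
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
    return os.environ["PYTORCH_ENABLE_MPS_FALLBACK"]


enable_mps_fallback()
# import torch  # import PyTorch only after the variable has been set
```

Setting the variable in the shell before launching Python (`export PYTORCH_ENABLE_MPS_FALLBACK=1`) should have the same effect and avoids any import-order concerns.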

Training ran as a result, but it always fails after a few epochs, or even in the first epoch, and returns the error:

ValueError: Expected parameter loc (Tensor of shape (X, Y)) of distribution Normal(loc: torch.Size([X, Y]), scale: torch.Size([X, Y])) to satisfy the constraint Real(), but found invalid values
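For context, this error is raised when the `loc` tensor contains NaN or infinite values. One thing that can be ruled out cheaply is whether non-finite values were already present in the input. The sketch below assumes NumPy is installed and that the data can be converted to a dense float array; `count_nonfinite` is a hypothetical helper, not an scvi-tools function, and a clean result does not rule out NaNs produced during training itself.

```python
import numpy as np


def count_nonfinite(values) -> int:
    """Return how many entries are NaN, +inf, or -inf."""
    arr = np.asarray(values, dtype=float)
    # np.isfinite is False for NaN and for both infinities.
    return int(arr.size - np.isfinite(arr).sum())


# Toy matrix with one NaN and one inf among four entries.
example = np.array([[1.0, 2.0], [np.nan, np.inf]])
print(count_nonfinite(example))  # prints 2
```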

Can anyone please help with this?

adamgayoso · November 11, 2022, 7:58pm

PyTorch MPS support is not fully operational (i.e., some tensor operations fail, as happened to you). Therefore, M1 support is restricted to CPU. We do not anticipate much of a speedup for scvi models on MPS anyway.

temitopeleke · November 11, 2022, 8:15pm

Thanks a lot for the quick reply!

I’m fairly new to this space. Since all my personal compute resources are Apple Silicon based, are there any platforms with GPU support that you could recommend (something like SageMaker)?

Thanks!
