Hello,
Backstory:
I successfully installed scvi-tools on Apple Silicon. Training succeeded on a small dataset (20k cells, 3k genes), but it became extremely slow on a larger one (20k cells, 17k genes). So I decided to use accelerated PyTorch training on MPS. However, the aten::index.Tensor operation is not yet implemented for MPS, so I get the following error:
NotImplementedError: Could not run 'aten::index.Tensor' with arguments from the 'MPS' backend.
This was described in this GitHub issue, which suggested setting the PYTORCH_ENABLE_MPS_FALLBACK=1
environment variable so that the unsupported operation falls back to the CPU.
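For anyone trying to reproduce this, here is a minimal sketch of setting the variable from Python rather than the shell. It assumes the variable has to be in place before torch is imported, and the device check is only illustrative, not part of scvi-tools:

```python
import os

# Assumption: set this before importing torch so the fallback is picked up.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

try:
    import torch

    # torch.backends.mps only exists in recent PyTorch builds, so guard the lookup.
    mps_backend = getattr(torch.backends, "mps", None)
    device = "mps" if mps_backend is not None and mps_backend.is_available() else "cpu"
except ImportError:
    # PyTorch is not installed in this environment.
    device = "cpu"

print(f"Using device: {device}")
```

Running `export PYTORCH_ENABLE_MPS_FALLBACK=1` in the shell before launching Python should have the same effect.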
With the fallback enabled, training starts, but it always fails after a few epochs, sometimes even in the first epoch, with this error:
ValueError: Expected parameter loc (Tensor of shape (X, Y)) of distribution Normal(loc: torch.Size([X, Y]), scale: torch.Size([X, Y])) to satisfy the constraint Real(), but found invalid values
Can anyone please help with this?