Any suggestions for speeding up model training (vae and solo) on M2 Mac

pointwave · July 6, 2024, 10:11pm

Hello scverse!
First time asking a question, so let me know what I can improve.
I’ve been trying to train the VAE and SOLO models, but using my MPS GPU throws the same error mentioned in this post (Error when training model on M3 Max MPS), so I’ve been training on CPU only. It is extremely slow because of the dataset size (700,000 x 35,000), and I was wondering if you had any suggestions for making sure training runs as fast as possible.

I’ve been running this code:

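import scvi

# Data loader and threading settings
scvi.settings.dl_num_workers = 11
scvi.settings.batch_size = 2048
scvi.settings.num_threads = 10

# Register the AnnData object, build the model, and train with default settings
scvi.model.SCVI.setup_anndata(adata)
vae = scvi.model.SCVI(adata)
vae.train()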

If it helps, here is the startup output of the code above:

GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/opt/miniconda3/envs/scanpy_env/lib/python3.9/site-packages/lightning/pytorch/trainer/setup.py:187: GPU available but not used. You can set it by doing `Trainer(accelerator='gpu')`.
/opt/miniconda3/envs/scanpy_env/lib/python3.9/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:436: Consider setting `persistent_workers=True` in 'train_dataloader' to speed up the dataloader worker initialization.

Specs: M2 Max, 94 GB RAM
CPU usage during training: 50-65%
RAM usage during training: ~40 GB

martinkim0 · July 8, 2024, 5:34pm

I haven’t tried this recently, so I’m not sure if it’s stable, but you can install an MPS-supported version of PyTorch and then use the MPS backend by passing in:

vae.train(accelerator="mps")
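
Since MPS support may not be stable, it can help to check whether PyTorch actually reports a usable MPS backend before passing it to train(). Below is a minimal sketch of that check. The helper name pick_accelerator is hypothetical (it is not part of scvi-tools), and it simply falls back to "cpu" when PyTorch is missing or MPS is not available:

```python
def pick_accelerator() -> str:
    """Return "mps" if PyTorch reports a usable MPS backend, otherwise "cpu".

    Hypothetical helper for illustration; not part of scvi-tools.
    """
    try:
        import torch
    except ImportError:
        # No PyTorch installed, so there is nothing to accelerate with.
        return "cpu"

    # Older PyTorch builds may not expose torch.backends.mps at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_built() and mps.is_available():
        return "mps"
    return "cpu"


accelerator = pick_accelerator()
print(f"Selected accelerator: {accelerator}")

# Then, with the model from the original post:
# vae.train(accelerator=accelerator)
```

Note that this only confirms the backend is present. It does not guarantee that every operation used during training is supported on MPS, so the error from the linked post may still appear.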
