Hi,
I would like to use a GPU to speed up scVI training, but model$train() fails with the error below. It seems to come from PyTorch and CUDA, so I tried re-installing PyTorch, but training still fails.
Do you have any idea what is going wrong? Thanks!
========
Sys.setenv(CUDA_VISIBLE_DEVICES = "1")
torch$cuda$is_available()
[1] TRUE
scvi$model$SCVI$setup_anndata(adata_5_in_1, batch_key = "batch")
model <- scvi$model$SCVI(adata_5_in_1)
model$train(accelerator = "gpu")
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Error in py_call_impl(callable, call_args$unnamed, call_args$named) :
torch.cuda.DeferredCudaCallError: CUDA call failed lazily at initialization with error: device >= 0 && device < num_gpus INTERNAL ASSERT FAILED at "…/aten/src/ATen/cuda/CUDAContext.cpp":49, please report a bug to PyTorch. device=1, num_gpus=
<…truncated…> 1360, in _find_and_load
  File "", line 1331, in _find_and_load_unlocked
  File "", line 935, in _load_unlocked
  File "", line 999, in exec_module
  File "", line 488, in _call_with_frames_removed
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 264, in <module>
    _lazy_call(_check_capability)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 261, in _lazy_call
    _queued_calls.append((callable, traceback.format_stack()))
Run reticulate::py_last_error() for details.
── Python Exception Message ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Traceback (most recent call last):
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 332, in _lazy_init
    queued_call()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 200, in _check_capability
    capability = get_device_capability(d)
                 ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 509, in get_device_capability
    prop = get_device_properties(device)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 527, in get_device_properties
    return _get_device_properties(device)  # type: ignore[name-defined]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: device >= 0 && device < num_gpus INTERNAL ASSERT FAILED at "…/aten/src/ATen/cuda/CUDAContext.cpp":49, please report a bug to PyTorch. device=1, num_gpus=
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/scvi/model/base/_training_mixin.py", line 161, in train
    return runner()
           ^^^^^^^^
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/scvi/train/_trainrunner.py", line 96, in __call__
    self.trainer.fit(self.training_plan, self.data_splitter)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/scvi/train/_trainer.py", line 210, in fit
    super().fit(*args, **kwargs)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py", line 539, in fit
    call._call_and_handle_interrupt(
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/lightning/pytorch/trainer/call.py", line 47, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py", line 575, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py", line 938, in _run
    self.strategy.setup_environment()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/lightning/pytorch/strategies/strategy.py", line 129, in setup_environment
    self.accelerator.setup_device(self.root_device)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/lightning/pytorch/accelerators/cuda.py", line 46, in setup_device
    _check_cuda_matmul_precision(device)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/lightning/fabric/accelerators/cuda.py", line 161, in _check_cuda_matmul_precision
    if not torch.cuda.is_available() or not _is_ampere_or_later(device):
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/lightning/fabric/accelerators/cuda.py", line 155, in _is_ampere_or_later
    major, _ = torch.cuda.get_device_capability(device)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 509, in get_device_capability
    prop = get_device_properties(device)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 523, in get_device_properties
    _lazy_init()  # will define _get_device_properties
    ^^^^^^^^^^^^
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 338, in _lazy_init
    raise DeferredCudaCallError(msg) from e
torch.cuda.DeferredCudaCallError: CUDA call failed lazily at initialization with error: device >= 0 && device < num_gpus INTERNAL ASSERT FAILED at "…/aten/src/ATen/cuda/CUDAContext.cpp":49, please report a bug to PyTorch. device=1, num_gpus=
CUDA call was originally invoked at:
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 122, in _find_and_load_hook
    return _run_hook(name, _hook)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 96, in _run_hook
    module = hook()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 120, in _hook
    return _find_and_load(name, import_)
  File "", line 1360, in _find_and_load
  File "", line 1331, in _find_and_load_unlocked
  File "", line 935, in _load_unlocked
  File "", line 999, in exec_module
  File "", line 488, in _call_with_frames_removed
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/anndata/__init__.py", line 39, in <module>
    from .io import read_h5ad, read_zarr
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 122, in _find_and_load_hook
    return _run_hook(name, _hook)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 96, in _run_hook
    module = hook()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 120, in _hook
    return _find_and_load(name, import_)
  File "", line 1360, in _find_and_load
  File "", line 1331, in _find_and_load_unlocked
  File "", line 935, in _load_unlocked
  File "", line 999, in exec_module
  File "", line 488, in _call_with_frames_removed
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/anndata/io.py", line 7, in <module>
    from ._io.h5ad import read_h5ad, write_h5ad
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 122, in _find_and_load_hook
    return _run_hook(name, _hook)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 96, in _run_hook
    module = hook()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 120, in _hook
    return _find_and_load(name, import_)
  File "", line 1360, in _find_and_load
  File "", line 1331, in _find_and_load_unlocked
  File "", line 935, in _load_unlocked
  File "", line 999, in exec_module
  File "", line 488, in _call_with_frames_removed
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/anndata/_io/h5ad.py", line 25, in <module>
    from ..experimental import read_dispatched
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 122, in _find_and_load_hook
    return _run_hook(name, _hook)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 96, in _run_hook
    module = hook()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 120, in _hook
    return _find_and_load(name, import_)
  File "", line 1360, in _find_and_load
  File "", line 1331, in _find_and_load_unlocked
  File "", line 935, in _load_unlocked
  File "", line 999, in exec_module
  File "", line 488, in _call_with_frames_removed
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/anndata/experimental/__init__.py", line 12, in <module>
    from .pytorch import AnnLoader
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 122, in _find_and_load_hook
    return _run_hook(name, _hook)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 96, in _run_hook
    module = hook()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 120, in _hook
    return _find_and_load(name, import_)
  File "", line 1360, in _find_and_load
  File "", line 1331, in _find_and_load_unlocked
  File "", line 935, in _load_unlocked
  File "", line 999, in exec_module
  File "", line 488, in _call_with_frames_removed
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/anndata/experimental/pytorch/__init__.py", line 3, in <module>
    from ._annloader import AnnLoader
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 122, in _find_and_load_hook
    return _run_hook(name, _hook)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 96, in _run_hook
    module = hook()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 120, in _hook
    return _find_and_load(name, import_)
  File "", line 1360, in _find_and_load
  File "", line 1331, in _find_and_load_unlocked
  File "", line 935, in _load_unlocked
  File "", line 999, in exec_module
  File "", line 488, in _call_with_frames_removed
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/anndata/experimental/pytorch/_annloader.py", line 19, in <module>
    import torch
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 122, in _find_and_load_hook
    return _run_hook(name, _hook)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 96, in _run_hook
    module = hook()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 120, in _hook
    return _find_and_load(name, import_)
  File "", line 1360, in _find_and_load
  File "", line 1331, in _find_and_load_unlocked
  File "", line 935, in _load_unlocked
  File "", line 999, in exec_module
  File "", line 488, in _call_with_frames_removed
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/__init__.py", line 1954, in <module>
    _C._initExtension(_manager_path())
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 122, in _find_and_load_hook
    return _run_hook(name, _hook)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 96, in _run_hook
    module = hook()
  File "/home/guest001/miniconda3/envs/scvi-env/lib/R/library/reticulate/python/rpytools/loader.py", line 120, in _hook
    return _find_and_load(name, import_)
  File "", line 1360, in _find_and_load
  File "", line 1331, in _find_and_load_unlocked
  File "", line 935, in _load_unlocked
  File "", line 999, in exec_module
  File "", line 488, in _call_with_frames_removed
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 264, in <module>
    _lazy_call(_check_capability)
  File "/home/guest001/miniconda3/envs/scvi-env/lib/python3.12/site-packages/torch/cuda/__init__.py", line 261, in _lazy_call
    _queued_calls.append((callable, traceback.format_stack()))
── R Traceback ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
▆
- └─model$train(accelerator = "gpu")
- └─reticulate:::py_call_impl(callable, call_args$unnamed, call_args$named)
See reticulate::py_last_error()$r_trace$full_call for more details.
=============
Here is my GPU information:
Wed Mar 5 15:50:35 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------|
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro P4000                   Off |   00000000:17:00.0 Off |                  N/A |
| 85%   85C    P0             66W /  105W |    7461MiB /   8192MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro P1000                   Off |   00000000:73:00.0 Off |                  N/A |
| 34%   29C    P8             N/A /   N/A |      61MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      3725      G   /usr/lib/xorg/Xorg                              4MiB |
|    0   N/A  N/A      6867      C   dorado_basecall_server                        910MiB |
|    0   N/A  N/A   1438629      C   guppy_basecaller                             6542MiB |
|    1   N/A  N/A      3725      G   /usr/lib/xorg/Xorg                             12MiB |
|    1   N/A  N/A      5471      G   /usr/bin/gnome-shell                            2MiB |
|    1   N/A  N/A      6867      C   dorado_basecall_server                         40MiB |
+-----------------------------------------------------------------------------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Jan_15_19:20:09_PST_2025
Cuda compilation tools, release 12.8, V12.8.61
Build cuda_12.8.r12.8/compiler.35404655_0
======== re-installation
pip uninstall torch torchvision torchaudio -y
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121