Iteration consumes memory

Dear all,

I have an AnnData object with data from six patients. When I loop over the patient ids, subset the data, and run the standard scVI workflow, the first patient takes about 3 minutes to complete. The second patient subset, however, takes hours. Yet if I run the second patient id on its own, it also finishes in about 3 minutes.
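
For reference, a minimal sketch of the loop I mean (the 'patient_id' column comes from my data; training details are elided):

import scvi

for patient_id in adata.obs['patient_id'].cat.categories:
    # subset to one patient; .copy() gives an independent AnnData
    adata_i = adata[adata.obs['patient_id'] == patient_id].copy()
    scvi.model.SCVI.setup_anndata(adata_i)
    model = scvi.model.SCVI(adata_i)
    model.train()  # ~3 minutes for the first patient, hours for the second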

I think the issue is similar to the circular-referencing problem with AnnData (Anndata not properly garbage collected · Issue #360 · scverse/anndata · GitHub).

I would highly appreciate your input on how to ensure that an scVI process is removed from memory, for example when looping through patient ids.

Many thanks for your help!

Best, Florian

Hi, your workflow isn't entirely clear to me. Why do you want to run it separately per patient? Are you seeing full GPU memory or full system RAM?
That could explain the slowdown; disabling memory pinning on the GPU would then help.
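
If pinning is indeed the issue, something along these lines might work (a sketch; I am assuming that datasplitter_kwargs in recent scvi-tools releases forwards pin_memory to the data loaders - check your installed version):

import scvi

scvi.model.SCVI.setup_anndata(adata)
model = scvi.model.SCVI(adata)
# pin_memory=False is an assumption about the DataSplitter plumbing
model.train(datasplitter_kwargs={"pin_memory": False})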


Hi @cane11,

thank you very much for your time. I run scVI per patient and use the model for SOLO doublet detection. Currently I use the CPU on our HPC (Python v3.9.19, scvi-tools v1.1.6.post2).
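
For context, the per-patient SOLO step looks roughly like this (a sketch; `model` is the trained per-patient scVI model):

from scvi.external import SOLO

solo = SOLO.from_scvi_model(model)  # build SOLO on top of the trained scVI model
solo.train()
doublet_calls = solo.predict()  # per-cell doublet/singlet probabilities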

I set scvi.settings.num_threads = 38 and, in the Slurm job, --nodes=1, --ntasks=1, --cpus-per-task=38.
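
One pattern I could also use (a sketch): read the allocation from Slurm's environment so the thread count always matches the job request:

import os
import scvi

# SLURM_CPUS_PER_TASK is set by Slurm inside the job
scvi.settings.num_threads = int(os.environ.get("SLURM_CPUS_PER_TASK", "1"))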

It feels like the CPU resources are still occupied after training the model for the first patient id. Do you have any idea what I could try to free the resources?

Many thanks,
Florian

I don't know if this is good coding practice, but importing scvi within the function and running each patient in its own Process seems to free up the CPU after training a model.

import gc
from multiprocessing import Process

def scvi_workflow(adata, patient_id):
    import scvi  # import inside the worker so module state lives in the child process
    scvi.settings.num_threads = 8  # limit scVI threads per worker
    adata_i = adata[adata.obs['patient_id'] == patient_id]

    [...]  # train model etc.

    del adata_i, model_i  # model_i is created in the elided training step
    gc.collect()

    return None

for patient_id in adata.obs['patient_id'].cat.categories:
    p = Process(target=scvi_workflow, args=(adata, patient_id))
    p.start()
    p.join()  # when the child exits, the OS reclaims all of its memory

My first intuition is: do you enable persistent workers? Without it, using multiple dataloader workers doesn't really speed things up; with it, the workers stay alive even after training, as we don't kill the dataloader. Deleting the model should be sufficient though - is the gc step really needed? See What are the (dis) advantages of persistent_workers - #8 by albanD - vision - PyTorch Forums for a longer discussion. It would be helpful to see a more complete version of the script.
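
To illustrate the flag in plain PyTorch (not scVI-specific; a minimal sketch):

import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.randn(1000, 10))
# with persistent_workers=True the worker processes stay alive between
# epochs - and after training - until the DataLoader is garbage collected
loader = DataLoader(ds, batch_size=32, num_workers=4, persistent_workers=True)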

Hi @cane11, thank you very much for the pointers! I was unable to replicate the behavior, either with other data or with my own. However, I realized that server updates affecting the CPU distribution were running while the problem occurred. Maybe that caused the behavior, but I can't be sure. My apologies.