How to save intermediate checkpoints?

gregjohnso · August 29, 2023, 12:14am

What is the standard way of saving intermediate states during model training? We came up with a workaround (here) using PyTorch Lightning callbacks, as indicated in the documentation (here), but it seems largely unsupported and quite hacky.

Is there a way to periodically save checkpoints during training that is natively supported by scvi-tools, and also retains trainer state?

martinkim0 · September 14, 2023, 5:24am

Hi @gregjohnso, thank you for your question. We don’t have a super straightforward way of saving intermediate checkpoints as of right now. I’ll see if I can implement a solution for this for our next release. For now, you can track the feature request here: "Add changes to make saving intermediate checkpoints easier" (Issue #2264, scverse/scvi-tools on GitHub).

martinkim0 · November 16, 2023, 7:50pm

Hi @gregjohnso, just wanted to update you that we now have an experimental SaveCheckpoint callback that subclasses Lightning’s ModelCheckpoint for compatibility with our model saves. You can enable it automatically by passing enable_checkpointing=True to most train methods, or pass it in explicitly through the callbacks argument. We expect to release this by the end of the year with our 1.1 release. In the meantime, feel free to install from the main branch - feedback is appreciated!
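
For readers who want to try this, here is a minimal sketch of the two options described above. It is not taken from the scvi-tools documentation. The helper name train_with_checkpoints and its checkpoint_callback parameter are hypothetical, and it assumes the model's train method accepts max_epochs alongside the enable_checkpointing and callbacks arguments mentioned in the post. The import path in the usage comments (scvi.train.SaveCheckpoint) is also an assumption, so check it against your installed version.

```python
def train_with_checkpoints(model, max_epochs=100, checkpoint_callback=None):
    """Train `model` with intermediate checkpointing turned on.

    `model` is assumed to be an scvi-tools model whose `train` method
    accepts `enable_checkpointing` and `callbacks`, as described in the
    post above. This helper is hypothetical, not part of scvi-tools.
    """
    if checkpoint_callback is None:
        # Option 1: let the train method set up the checkpoint callback itself.
        return model.train(max_epochs=max_epochs, enable_checkpointing=True)
    # Option 2: pass an explicitly constructed callback via `callbacks`.
    return model.train(max_epochs=max_epochs, callbacks=[checkpoint_callback])


# Usage (assumes scvi-tools >= 1.1 and an AnnData object `adata`
# already registered with scvi.model.SCVI.setup_anndata):
#
#   import scvi
#   from scvi.train import SaveCheckpoint  # assumed import path
#
#   model = scvi.model.SCVI(adata)
#   train_with_checkpoints(model, max_epochs=50)
#   # or, with an explicit callback:
#   train_with_checkpoints(model, max_epochs=50,
#                          checkpoint_callback=SaveCheckpoint())
```

Since the callback is described as experimental, its constructor arguments and the location of the saved checkpoints may change between releases.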

gregjohnso · February 3, 2024, 7:46pm

awesome thank you @martinkim0 !
