Hello,
I am hoping to ask about TOTALVI. I have followed the steps in the tutorial, but when I run vae.train() I get the following error, which I do not understand:
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/datamodule.py:424: LightningDeprecationWarning: DataModule.setup has already been called, so it will not be called again. In v1.6 this behavior will change to always call DataModule.setup.
f"DataModule.{name} has already been called, so it will not be called again. "
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/data_loading.py:323: UserWarning: The number of training samples (34) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
f"The number of training samples ({self.num_training_batches}) is smaller than the logging interval"
Epoch 1/400: -0%| | -1/400 [00:00<?, ?it/s]/usr/local/lib/python3.7/dist-packages/tqdm/std.py:538: TqdmWarning: clamping frac to range [0, 1]
colour=colour)
Epoch 1/400: -0%| | -1/400 [00:00<?, ?it/s]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-70-511a56042b55> in <module>()
----> 1 vae.train()
15 frames
/usr/local/lib/python3.7/dist-packages/scvi/train/_trainingplans.py in training_epoch_end(self, outputs)
133 n_obs, elbo, rec_loss, kl_local = 0, 0, 0, 0
134 for tensors in outputs:
--> 135 elbo += tensors["reconstruction_loss_sum"] + tensors["kl_local_sum"]
136 rec_loss += tensors["reconstruction_loss_sum"]
137 kl_local += tensors["kl_local_sum"]
KeyError: 'reconstruction_loss_sum'
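From what I can tell from the traceback, training_epoch_end is just summing per-batch loss dictionaries, and the lookup fails because a batch's output dict is missing the "reconstruction_loss_sum" key. A minimal standalone sketch of that failure mode (no scvi needed; the dict contents here are hypothetical):

```python
# Hypothetical per-batch outputs: the first dict lacks the loss keys
# that training_epoch_end expects, mirroring the traceback above.
outputs = [{"loss": 1.0}]

elbo = 0
err_key = None
for tensors in outputs:
    try:
        # Same expression as line 135 of scvi/train/_trainingplans.py
        elbo += tensors["reconstruction_loss_sum"] + tensors["kl_local_sum"]
    except KeyError as err:
        # str(err) reproduces the message shown in the traceback
        err_key = str(err)

print(err_key)
```

So the error itself just means the per-epoch aggregation received batch outputs without the expected loss entries; why that happens with my data is the part I don't understand.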
I’ve followed the instructions pretty much to the letter, so I’m not sure why I get this error or what it means. Could you please advise? Many thanks in advance.