Question:
How do I set torch.utils.data.DataLoader(drop_last=True)? In this case, setting it at runtime or at the model level would work equally well.
Background:
I need to do some latent space arithmetic that requires even batch sizes.
In the meantime, I was able to simulate DataLoader(drop_last=True) by subsetting the data so that all batches are of size minibatch_size:
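import numpy as np
import scanpy as sc
minibatch_size = 100  # target size of every batch
sc.pp.subsample(adata, n_obs=adata.shape[0] - np.mod(adata.shape[0], minibatch_size))  # drop the remainder in place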
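The arithmetic behind this workaround can be checked without any data. The sketch below is only an illustration: the helper names trimmed_n_obs and batch_sizes and the cell count 1234 are hypothetical, not part of scanpy or scvi-tools.

```python
def trimmed_n_obs(n_obs: int, minibatch_size: int) -> int:
    """Largest multiple of minibatch_size that does not exceed n_obs."""
    return n_obs - (n_obs % minibatch_size)


def batch_sizes(n_obs: int, minibatch_size: int) -> list:
    """Sizes of the consecutive batches when nothing is dropped."""
    full, rest = divmod(n_obs, minibatch_size)
    return [minibatch_size] * full + ([rest] if rest else [])


n_obs = 1234  # hypothetical number of cells
minibatch_size = 100

kept = trimmed_n_obs(n_obs, minibatch_size)
print(kept)                                    # 1200
print(batch_sizes(n_obs, minibatch_size)[-1])  # 34, the ragged last batch
print(set(batch_sizes(kept, minibatch_size)))  # {100}
```

One caveat, stated as an assumption about the training setup: if the data are later split into training and validation subsets, the training subset may again not be a multiple of minibatch_size, so the trimming would have to account for the split as well.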
You would need to write your own datasplitter. See here:
This would then be used in your own custom train function via the TrainRunner.
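Whatever form the custom data splitter takes, the behaviour it has to reproduce is that of drop_last=True in a PyTorch batch sampler: the final batch is discarded when it is smaller than the batch size. The following is a library-free sketch of that rule only. The function iter_batches is a hypothetical helper, not the scvi-tools or PyTorch implementation.

```python
from typing import Iterator, List, Sequence


def iter_batches(
    indices: Sequence[int], batch_size: int, drop_last: bool = False
) -> Iterator[List[int]]:
    """Yield consecutive batches of indices.

    With drop_last=True, a trailing batch shorter than batch_size is skipped.
    """
    for start in range(0, len(indices), batch_size):
        batch = list(indices[start:start + batch_size])
        if drop_last and len(batch) < batch_size:
            return  # only the final batch can be short, so stop here
        yield batch


indices = list(range(10))  # hypothetical observation indices

all_batches = list(iter_batches(indices, batch_size=4))
full_batches = list(iter_batches(indices, batch_size=4, drop_last=True))

print([len(b) for b in all_batches])   # [4, 4, 2]
print([len(b) for b in full_batches])  # [4, 4]
```

In a real data splitter the same effect would normally come from passing drop_last through to the underlying data loader rather than reimplementing the batching.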
Thanks @adamgayoso!
Following your comment, I’ve created a pull request to provide some additional functionality to DataSplitter and SemiSupervisedDataSplitter, which makes it possible to define defaults for all data_loader_kwargs (including the existing drop_last=3), while simultaneously keeping AnnDataLoader transparent to the Lightning DataLoader API wrt parameters like drop_last.