How to set DataLoader(drop_last=True) for a model?

Question:

How do I set torch.utils.data.DataLoader(drop_last=True)?
Setting it either at runtime or at the model level would work equally well for me.

Background:

I need to do some latent space arithmetic that requires even batch sizes.

In the meantime, I was able to simulate DataLoader(drop_last=True) by subsetting the data so that all batches are of size minibatch_size:

import numpy as np
import scanpy as sc
minibatch_size = 100
sc.pp.subsample(adata, n_obs=adata.shape[0] - np.mod(adata.shape[0], minibatch_size))
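The arithmetic behind this workaround can be checked without AnnData. A minimal sketch in plain Python, where trimmed_n_obs and batch_sizes are hypothetical helper names used only for illustration and the cell count of 1050 is made up:

```python
from typing import List


def trimmed_n_obs(n_obs: int, batch_size: int) -> int:
    """Largest multiple of batch_size that does not exceed n_obs."""
    return n_obs - n_obs % batch_size


def batch_sizes(n_obs: int, batch_size: int) -> List[int]:
    """Sizes of consecutive batches over n_obs items when nothing is dropped."""
    full, remainder = divmod(n_obs, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


n_obs, minibatch_size = 1050, 100

# Without trimming, the last batch holds only the remainder.
print(batch_sizes(n_obs, minibatch_size)[-1])  # 50

# After trimming, every batch has exactly minibatch_size items.
kept = trimmed_n_obs(n_obs, minibatch_size)
print(kept, set(batch_sizes(kept, minibatch_size)))  # 1000 {100}
```

One difference from a real drop_last=True is worth keeping in mind: subsampling removes the same cells for the whole training run, whereas a shuffling loader with drop_last=True would leave out a different remainder each epoch.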

You would need to write your own DataSplitter. See here:

This would then be used in your own custom train function via the TrainRunner.
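For reference, this is the loader behavior a custom splitter would ultimately need to produce. A minimal sketch in plain PyTorch, not the scvi-tools DataSplitter itself; the toy TensorDataset and its shape are assumptions made for illustration:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for the real data: 250 observations with 8 features each.
dataset = TensorDataset(torch.randn(250, 8))

# drop_last=True discards the final, incomplete batch (here 50 rows).
loader = DataLoader(dataset, batch_size=100, shuffle=True, drop_last=True)

# Each batch from a TensorDataset is a tuple with one tensor per input tensor.
sizes = [batch[0].shape[0] for batch in loader]
print(sizes)  # [100, 100]
```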

Thanks @adamgayoso!

Following your comment, I’ve created a pull request that adds some functionality to DataSplitter and SemiSupervisedDataSplitter. It makes it possible to define defaults for all data_loader_kwargs (including the existing drop_last=3) while keeping AnnDataLoader transparent to the Lightning DataLoader API with respect to parameters like drop_last.
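The general idea of defaults that a caller can still override can be sketched as a dictionary merge. This is only an illustration of the pattern, not the code from the pull request; DEFAULT_LOADER_KWARGS and resolve_loader_kwargs are hypothetical names:

```python
from typing import Any, Dict, Optional

# Hypothetical defaults, mirroring the drop_last=3 mentioned above.
DEFAULT_LOADER_KWARGS: Dict[str, Any] = {"drop_last": 3, "shuffle": True}


def resolve_loader_kwargs(user_kwargs: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the defaults, with any user-supplied values taking precedence."""
    return {**DEFAULT_LOADER_KWARGS, **(user_kwargs or {})}


# No overrides: the defaults are used unchanged.
print(resolve_loader_kwargs())
# A user-supplied drop_last replaces the default; other defaults are kept.
print(resolve_loader_kwargs({"drop_last": True, "batch_size": 100}))
```

Because the user's dictionary is unpacked last, any key it contains wins over the default, and keys it does not mention fall back to the defaults.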