How to set DataLoader(drop_last=True) for a model?


How do I set DataLoader(drop_last=True) for a model?
In this case, setting it at runtime or at the model level would work equally well.


I need to do some latent space arithmetic that requires even batch sizes.

In the meantime, I was able to simulate DataLoader(drop_last=True) by subsetting the data so that all batches are of size minibatch_size:

sc.pp.subsample(adata, n_obs=adata.shape[0] - np.mod(adata.shape[0], minibatch_size))
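For anyone checking the arithmetic behind that workaround, here is a small library-free sketch. The function names and the numbers are made up for illustration and are not part of scanpy or scvi-tools.

```python
def trimmed_n_obs(n_obs: int, minibatch_size: int) -> int:
    """Largest multiple of minibatch_size that does not exceed n_obs."""
    return n_obs - n_obs % minibatch_size


def batch_sizes(n_obs: int, minibatch_size: int) -> list:
    """Sizes of consecutive batches when no batch is dropped."""
    return [
        min(minibatch_size, n_obs - start)
        for start in range(0, n_obs, minibatch_size)
    ]


# Hypothetical dataset: 1003 observations, batches of 128.
n_obs, minibatch_size = 1003, 128

# Before trimming, the final batch is shorter than the rest.
print(batch_sizes(n_obs, minibatch_size))

# After trimming, every batch has exactly minibatch_size observations.
kept = trimmed_n_obs(n_obs, minibatch_size)
print(kept, batch_sizes(kept, minibatch_size))
```

One caveat: this only guarantees full batches when the whole trimmed dataset goes through a single loader. If the data are later split into training and validation subsets, each subset's size would also need to be a multiple of minibatch_size.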

You would need to write your own datasplitter. See here:

This would then be used in your own custom train function via the TrainRunner.
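To make the target behaviour concrete, here is a rough, library-free sketch of what the training loader produced by a custom splitter would need to do: yield index batches and skip a trailing short one. iter_batches and its parameters are hypothetical and are not the scvi-tools or PyTorch API. In practice you would pass drop_last=True to the loader your splitter constructs.

```python
import random
from typing import Iterator, List, Optional


def iter_batches(
    n_obs: int,
    batch_size: int,
    drop_last: bool = True,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> Iterator[List[int]]:
    """Yield lists of observation indices, one list per batch."""
    indices = list(range(n_obs))
    if shuffle:
        # A seeded generator keeps the shuffle reproducible.
        random.Random(seed).shuffle(indices)
    for start in range(0, n_obs, batch_size):
        batch = indices[start:start + batch_size]
        if drop_last and len(batch) < batch_size:
            # Only the final batch can be short, so stop here.
            break
        yield batch


# With 10 observations and batches of 4, the trailing batch of 2 is dropped.
print([len(b) for b in iter_batches(10, 4, drop_last=True)])
print([len(b) for b in iter_batches(10, 4, drop_last=False)])
```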

Thanks @adamgayoso!

Following your comment, I’ve created a pull request that adds some functionality to DataSplitter and SemiSupervisedDataSplitter. It makes it possible to define defaults for all data_loader_kwargs (including the existing drop_last=3), while keeping AnnDataLoader transparent to the PyTorch DataLoader API with respect to parameters like drop_last.
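The general idea of defaults that a caller can override might look roughly like the sketch below. This is not the code from the pull request. resolve_loader_kwargs, make_loader and the shuffle default are hypothetical names and values chosen for illustration; only drop_last=3 comes from the existing behaviour mentioned above.

```python
from typing import Any, Dict, Optional

# Assumed defaults for illustration; only drop_last=3 is taken from the thread.
DEFAULT_LOADER_KWARGS: Dict[str, Any] = {"shuffle": True, "drop_last": 3}


def resolve_loader_kwargs(
    user_kwargs: Optional[Dict[str, Any]] = None,
    defaults: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the defaults with any user-supplied values taking precedence."""
    merged = dict(DEFAULT_LOADER_KWARGS if defaults is None else defaults)
    merged.update(user_kwargs or {})
    return merged


def make_loader(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for a loader constructor; it just returns what it received."""
    return kwargs


# The user's drop_last=True replaces the default, the other defaults are kept.
loader_args = make_loader(**resolve_loader_kwargs({"drop_last": True}))
print(loader_args)
```

Merging into a single dict before the call matters because Python raises a TypeError when the same keyword is given both explicitly and through **kwargs, e.g. make_loader(drop_last=3, **{"drop_last": True}).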