Hi, I am trying to integrate a 4-million-cell scRNA-seq dataset. The adata has been subset to the top 2000 highly variable genes. I set up the model using almost default parameters:

model = scvi.model.SCVI(adata, n_layers=2, n_latent=30, gene_likelihood="nb")
model.train()

But training ran for only 2 epochs. What determines the number of training epochs?

Hi. We have a heuristic that sets the number of epochs. However, I would recommend using 20 epochs (max_epochs=20 in model.train). You should check model convergence afterwards (see the tutorials for plotting losses).
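
To make the heuristic concrete, here is a rough sketch of how the default appears to be derived: the epoch count shrinks as the number of cells grows and is capped at a maximum. The constants (20,000 reference cells, cap of 400) and the function name are assumptions written from memory, not a copy of the library code, so check the scvi-tools version you have installed. The lower clamp to 1 is my own addition.

```python
def max_epochs_heuristic(n_obs: int, epochs_cap: int = 400, ref_cells: int = 20_000) -> int:
    """Approximate default epoch count (hypothetical helper, assumed constants).

    The count scales inversely with the number of cells and never exceeds
    epochs_cap; it is clamped to at least 1 so training always runs.
    """
    scaled = round((ref_cells / n_obs) * epochs_cap)
    return max(1, min(scaled, epochs_cap))


# With 4 million cells this reproduces the 2 epochs reported above.
print(max_epochs_heuristic(4_000_000))

# Small datasets hit the cap instead.
print(max_epochs_heuristic(20_000))

# To override the default, pass the value explicitly, as suggested in the answer:
# model.train(max_epochs=20)
```

Under these assumptions, 2 epochs for 4 million cells is the expected default rather than a sign that training stopped early.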

But the UMAP plot is centered and the legend on the right side is cut off. I changed the figure size to (12, 6), but that only stretched the UMAP plot horizontally. How should I adjust the parameters to make the legend visible?

It might be that a few single cells are outliers in the UMAP, so the plot doesn’t fill the whole area. If you save as PNG, the legend should be within the plot. I’m not exactly sure why the legend ends up outside the plot in the PDF. You can call tight_layout in Matplotlib before saving. If that doesn’t work, please share your full plotting code.
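
A minimal Matplotlib sketch of the saving step, using random points as stand-in data since the original plotting code was not shared. It calls tight_layout as suggested, and additionally passes bbox_inches="tight" to savefig, which is a related option (not the one named above) that expands the saved canvas to include artists sitting outside the axes, such as a legend on the right. The file name and cluster labels are placeholders.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, ax = plt.subplots(figsize=(6, 6))

# Stand-in for UMAP coordinates: three random point clouds.
for label in ["cluster A", "cluster B", "cluster C"]:
    xy = rng.normal(size=(100, 2))
    ax.scatter(xy[:, 0], xy[:, 1], s=5, label=label)

# Put the legend outside the axes on the right, where it tends to get clipped.
ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5), frameon=False)

# Adjust the layout, then save with a bounding box that includes the legend.
fig.tight_layout()
out_path = "umap_legend.pdf"  # placeholder file name
fig.savefig(out_path, bbox_inches="tight")
```

If the figure comes from a plotting wrapper, the same savefig call should work on the underlying Matplotlib figure, as long as the wrapper gives you access to it instead of showing and closing the plot itself.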