where it first concatenated the sample(s) of unknown labels with the reference. The reference and my samples now come from different batches. I then trained vae = scvi.model.SCVI(adata) without any extra arguments, and it ran for only 6 epochs (the data is about 1.4 million cells). Is 6 epochs a reasonable number, and how can I assess whether the model is accurate?
was done to predict the labels of my samples, which are unknown. What does n_samples_per_label mean? My understanding is that it takes representative cells for each label, in this case 100 cells. Are those representative cells drawn from the unknown cells, or from somewhere else?
I would appreciate any help with this method.
I tend to train for at least 20 epochs, though this is based more on experience than on a rule. You should check elbo_validation and elbo_train afterwards. You can increase batch_size to 1024 (this speeds up training by roughly a factor of 8). Yes, it takes 100 representative cells for each cell type (or, if there are fewer than 100 cells of a cell type, all cells of that type). The classifier doesn’t have balanced weights, and this helps with balancing.
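On why training stopped at 6 epochs: as far as I know, scvi-tools picks a default number of epochs that shrinks as the dataset grows. The sketch below is my own re-implementation of that heuristic, not the library code; the function name and the constants 400 and 20,000 are assumptions based on my reading of scvi-tools, so check them against your installed version.

```python
def default_max_epochs(n_cells, epochs_cap=400, decay_at_n_cells=20_000):
    """Approximate the default epoch heuristic (assumed, not copied from scvi-tools).

    Datasets up to `decay_at_n_cells` cells get `epochs_cap` epochs; larger
    datasets get proportionally fewer, with a floor of one epoch.
    """
    scaled = round((decay_at_n_cells / n_cells) * epochs_cap)
    return max(1, min(scaled, epochs_cap))


# About 1.4 million cells, as in the question.
print(default_max_epochs(1_400_000))  # 6
```

If that is what happened here, 6 epochs is just the default for this data size and not a sign of early convergence. Passing max_epochs (and, if you like, batch_size) to vae.train() overrides it.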
vae
SCVI model with the following parameters:
n_hidden: 128, n_latent: 10, n_layers: 1, dropout_rate: 0.1, dispersion: gene,
gene_likelihood: zinb, latent_distribution: normal.
Training status: Trained
Model’s adata is minified?: False
lvae
ScanVI Model with the following params:
unlabeled_category: Unknown, n_hidden: 128, n_latent: 10, n_layers: 1,
dropout_rate: 0.1, dispersion: gene, gene_likelihood: zinb
Training status: Trained
Model’s adata is minified?: False
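To make the n_samples_per_label answer above concrete, here is a minimal sketch of per-label subsampling in plain Python. It is not the scvi-tools implementation. It assumes the draw is made only from labelled cells (my reading of the library, not something stated in the reply), and the function name and cell-type names are made up for illustration.

```python
import random
from collections import defaultdict


def subsample_per_label(labels, n_samples_per_label=100,
                        unlabeled_category="Unknown", seed=0):
    """Pick up to `n_samples_per_label` cell indices for each known label."""
    rng = random.Random(seed)

    # Group cell indices by label, skipping the unlabelled category (assumption).
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        if label != unlabeled_category:
            by_label[label].append(idx)

    chosen = {}
    for label, idxs in by_label.items():
        if len(idxs) <= n_samples_per_label:
            # Fewer cells than requested: keep all of them.
            chosen[label] = list(idxs)
        else:
            # Otherwise draw a random subset without replacement.
            chosen[label] = rng.sample(idxs, n_samples_per_label)
    return chosen


# Hypothetical label vector: one large type, one rare type, and unlabelled cells.
labels = ["T cell"] * 500 + ["B cell"] * 40 + ["Unknown"] * 300
picked = subsample_per_label(labels)
print({label: len(idxs) for label, idxs in picked.items()})
# {'T cell': 100, 'B cell': 40}
```

Capping every cell type at the same number is what keeps abundant types from dominating the classifier, which is the balancing effect mentioned in the reply.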