Hi,
I am performing doublet removal using scvi-tools. The following code works on my other dataset, but on a different, concatenated dataset SOLO training fails with a NaN loss. How can I overcome this problem? I checked that there are no NaN values in my input dataset.
sc.pp.filter_genes(doublet_adata, min_cells = 10)
sc.pp.highly_variable_genes(doublet_adata, n_top_genes = 2000, subset = True, flavor = 'seurat_v3')
filtered out 20160 genes that are detected in less than 10 cells
extracting highly variable genes
--> added
    'highly_variable', boolean vector (adata.var)
    'highly_variable_rank', float vector (adata.var)
    'means', float vector (adata.var)
    'variances', float vector (adata.var)
    'variances_norm', float vector (adata.var)
doublet_adata
AnnData object with n_obs × n_vars = 216049 × 2000
    obs: 'sample_id', 'diagnosis', 'tissue_location', 'Glioma_grade', '_scvi_batch', '_scvi_labels'
    var: 'gene_ids', 'n_cells', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm'
    uns: 'hvg', '_scvi_uuid', '_scvi_manager_uuid'
doublet_adata.obs.isnull().any()
sample_id False
diagnosis False
tissue_location False
Glioma_grade False
_scvi_batch False
_scvi_labels False
dtype: bool
doublet_adata.var.isnull().any()
gene_ids False
n_cells False
highly_variable False
highly_variable_rank False
means False
variances False
variances_norm False
dtype: bool
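The two checks above only look at .obs and .var, not at the count matrix in .X. Below is a minimal sketch of a check on the matrix itself. `summarize_counts` is a hypothetical helper written for illustration; it is not part of scanpy or scvi-tools. It assumes a cells x genes matrix that is either a dense NumPy array or a SciPy sparse matrix.

```python
import numpy as np
from scipy import sparse


def summarize_counts(X):
    """Return simple sanity statistics for a cells x genes count matrix.

    Hypothetical helper for illustration; accepts a dense array or a
    SciPy sparse matrix.
    """
    X = sparse.csr_matrix(X)
    data = X.data  # explicitly stored entries (NaN and inf are kept here)
    row_sums = np.asarray(X.sum(axis=1)).ravel()  # total counts per cell
    finite = data[np.isfinite(data)]
    return {
        "n_nan": int(np.isnan(data).sum()),
        "n_inf": int(np.isinf(data).sum()),
        "n_negative": int((finite < 0).sum()),
        "n_non_integer": int((finite % 1 != 0).sum()),
        "n_empty_cells": int((row_sums == 0).sum()),
    }


# Toy matrix: one all-zero cell, one NaN entry, one non-integer entry.
toy = np.array([[0, 0, 0], [1, 2, 0], [0.5, np.nan, 3]])
report = summarize_counts(toy)
print(report)
```

On the real data this would be called as `summarize_counts(doublet_adata.X)`, or on whichever layer holds the raw counts. A nonzero `n_empty_cells` would mean some cells have no counts left after subsetting to 2000 genes, and a nonzero `n_non_integer` would suggest the matrix is not raw counts. Either is only a candidate explanation to rule out, not a confirmed cause of the NaN loss.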
scvi.model.SCVI.setup_anndata(doublet_adata)
vae = scvi.model.SCVI(doublet_adata)
vae.train(batch_size=130)
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Epoch 37/37: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 37/37 [10:28<00:00, 16.95s/it, v_num=1, train_loss_step=21.7, train_loss_epoch=39.6]
Trainer.fit stopped: max_epochs=37 reached.
Epoch 37/37: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 37/37 [10:28<00:00, 16.99s/it, v_num=1, train_loss_step=21.7, train_loss_epoch=39.6]
solo = scvi.external.SOLO.from_scvi_model(vae)
solo.train(batch_size=130)
INFO Creating doublets, preparing SOLO model.
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Epoch 1/400: 0%|▏ | 1/400 [00:08<54:12, 8.15s/it, v_num=1, train_loss_step=nan, train_loss_epoch=nan]
Monitored metric validation_loss = nan is not finite. Previous best value was inf. Signaling Trainer to stop.
Thanks