NaN values in the latent space of scanvi_query

Hi

I’m trying to do label transfer with SCANVI using the following script:

import scanpy as sc
import scvi # scvi==1.0.4
from scvi.model import SCVI
from scvi.model import SCANVI
import rapids_singlecell as rsc

pancreas_full = sc.read_h5ad('./data/t2d/pancreas_full_v2.h5ad')

rmask = pancreas_full.obs['batch']=='reference'
rdata = pancreas_full[rmask].copy()
qdata = pancreas_full[~rmask].copy()

rdata.obs['cell_type_scanvi'] = rdata.obs["cell_type"].values

scvi.model.SCVI.setup_anndata(rdata, layer="counts", batch_key="batch")

scvi_ref = scvi.model.SCVI(rdata, n_layers=2, n_latent=30, gene_likelihood="nb")
scvi_ref.train(
    max_epochs=1000,
    check_val_every_n_epoch=10,
)

scanvi_ref = scvi.model.SCANVI.from_scvi_model(
    scvi_ref,
    labels_key="cell_type_scanvi",
    unlabeled_category="Unknown",
)
scanvi_ref.train(
    max_epochs=1000,
    check_val_every_n_epoch=10,
)

scvi.model.SCANVI.prepare_query_anndata(qdata, scanvi_ref)

scanvi_query = scvi.model.SCANVI.load_query_data(qdata, scanvi_ref)

scanvi_query.train(
    max_epochs=1000,
    plan_kwargs={"weight_decay": 0.0},
    check_val_every_n_epoch=10,
)

but during training of scanvi_query, in the first epoch, the latent space becomes an all-NaN matrix and training stops. I should mention that a colleague at TheisLab has reported the same behaviour, and we suspect we should avoid training scvi for too many epochs.

TBH, the scanvi tutorial is a bit ambiguous to me. It’s not clear why we need scanvi_ref and scanvi_query as two distinct models. Based on the scanvi methodology, I would have expected the script to look like this:

  • we don’t need scanvi_ref
  • get scanvi_query by running scanvi_query = scvi.model.SCANVI.load_query_data(qdata, scvi_ref)
  • scanvi_query.train() and then:
  • qdata.obs[SCANVI_PREDICTIONS_KEY] = scanvi_query.predict()

P.S. I tried adding inplace_subset_query_vars=True but ran out of memory.

Thanks for your help in advance

Hi,

  1. Why not work with the most recent version of scvi-tools? Many things have changed since 1.0.4, and it will not be easy to tackle your issue if it is specific to that version.

  2. Having said that, your process seems OK.
    But a failure in the first epoch usually means something is wrong with the data setup or with the data quality itself. See Frequently asked questions — scvi-tools.
    Given that you haven’t selected highly variable genes before training, how big is your data? Could this be related to memory issues? How did you create the “counts” layer?
    Note that in the tutorial we train scvi for scArches with additional parameters that tend to work well with scArches (although they are not mandatory):

scvi_ref = scvi.model.SCVI(rdata, n_layers=2, n_latent=30, gene_likelihood="nb",
                           use_layer_norm="both",
                           use_batch_norm="none",
                           encode_covariates=True,
                           dropout_rate=0.2,
                           )

In the same manner, you can set n_samples_per_label=100 when training scanvi_ref.

You can definitely train scanvi_ref/scanvi_query for just a few epochs.
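
To make the data-quality question concrete, here is a minimal sketch of a sanity check on the matrix passed as layer="counts", assuming it is a NumPy array or a SciPy sparse matrix with cells as rows. The function name check_counts and the keys it returns are made up for this example and are not part of scvi-tools. It flags NaN/inf values, negative or non-integer entries, and cells with zero total counts, all of which are worth ruling out (they are possible causes, not a confirmed diagnosis) before looking at the model itself.

```python
import numpy as np


def check_counts(X):
    """Summarise common problems in a raw-count matrix (dense or sparse).

    Hypothetical helper for illustration; not an scvi-tools function.
    """
    if hasattr(X, "tocsr"):
        # Sparse input: only stored entries need inspecting,
        # because implicit zeros are valid counts.
        values = np.asarray(X.tocsr().data, dtype=float)
    else:
        X = np.asarray(X, dtype=float)
        values = X.ravel()

    finite = values[np.isfinite(values)]
    # Total counts per cell (rows are cells, as in AnnData).
    totals = np.asarray(X.sum(axis=1), dtype=float).ravel()

    return {
        "has_nan": bool(np.isnan(values).any()),
        "has_inf": bool(np.isinf(values).any()),
        "has_negative": bool((finite < 0).any()),
        "has_non_integer": bool((finite % 1 != 0).any()),
        "n_empty_cells": int((totals == 0).sum()),
    }
```

With the script above this would be called as check_counts(rdata.layers["counts"]) and check_counts(qdata.layers["counts"]). Any True flag, or a non-zero n_empty_cells, suggests the layer does not hold clean raw counts and should be fixed before training.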

  3. You can’t load query data into a SCANVI model while the reference model is SCVI.
    Also, you need the scanvi reference model to predict cell types for a new, unseen dataset (ignore the fact that the tutorial splits the same data into two parts). The scvi reference model can’t perform prediction; that requires an additional classifier layer.
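
Putting the points above together, a rough sketch of the reference/query flow could look like the following. It only uses calls and arguments that already appear in this thread. The function name transfer_labels and the epoch defaults are placeholders chosen for illustration, not recommended values, and the sketch assumes rdata.obs["cell_type_scanvi"] and the "counts" layer are already in place as in the original script.

```python
def transfer_labels(rdata, qdata, ref_epochs=100, scanvi_epochs=20, query_epochs=100):
    """Hypothetical wrapper: train SCVI -> SCANVI on the reference, then map the query."""
    # Imported inside the function so the sketch can be defined
    # even where scvi-tools is not installed.
    import scvi

    # 1. Reference SCVI model, with the extra arguments suggested for scArches.
    scvi.model.SCVI.setup_anndata(rdata, layer="counts", batch_key="batch")
    scvi_ref = scvi.model.SCVI(
        rdata,
        n_layers=2,
        n_latent=30,
        gene_likelihood="nb",
        use_layer_norm="both",
        use_batch_norm="none",
        encode_covariates=True,
        dropout_rate=0.2,
    )
    scvi_ref.train(max_epochs=ref_epochs)

    # 2. Reference SCANVI model: this adds the classifier needed for prediction.
    scanvi_ref = scvi.model.SCANVI.from_scvi_model(
        scvi_ref,
        labels_key="cell_type_scanvi",
        unlabeled_category="Unknown",
    )
    scanvi_ref.train(max_epochs=scanvi_epochs, n_samples_per_label=100)

    # 3. Query model is loaded from the SCANVI reference, not from scvi_ref.
    scvi.model.SCANVI.prepare_query_anndata(qdata, scanvi_ref)
    scanvi_query = scvi.model.SCANVI.load_query_data(qdata, scanvi_ref)
    scanvi_query.train(max_epochs=query_epochs, plan_kwargs={"weight_decay": 0.0})

    return scanvi_query.predict()
```

The returned predictions can then be stored on the query object, for example qdata.obs[SCANVI_PREDICTIONS_KEY] = transfer_labels(rdata, qdata), using whatever key name you defined for SCANVI_PREDICTIONS_KEY.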