Hi developers and the community,
I’m running into an issue when running MultiVI as instructed in the tutorial:
>>> imputed_expression = model.get_normalized_expression()
>>> imputed_expression
chr1:181000-181500 chr1:191000-191500 chr1:191500-192000 ... chr17:40459500-40460000 chr17:40460000-40460500 chr17:40460500-40461000
A24-110:TATATCCTCATCCACC-1_1_1 5.421411e-06 0.000016 1.114787e-05 ... 1.362917e-06 0.000001 6.866701e-07
A24-236:CAAGAACCATAATCGT-1_1_1 5.591661e-06 0.000017 1.120139e-05 ... 1.344341e-06 0.000001 6.684292e-07
The returned shape is 73236 x 61817, which matches the input “sc” (RNA) layer; however, the column labels seem to be pulled from the input “atac” layer.
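To double-check which modality the column labels come from, a comparison like the following could be used. This is only a minimal sketch on toy data: `matching_modalities` is a hypothetical helper (not part of scvi-tools), and the small indexes stand in for `md_mvi.mod['rna'].var_names`, `md_mvi.mod['atac'].var_names` and `imputed_expression.columns`.

```python
import pandas as pd


def matching_modalities(columns, var_names_by_mod):
    """Return the modalities whose leading var names equal the given column labels."""
    cols = list(columns)
    return [
        mod
        for mod, names in var_names_by_mod.items()
        if list(names[: len(cols)]) == cols
    ]


# Toy stand-ins (hypothetical values) for the var_names of each modality.
rna_names = pd.Index(["AL627309.5", "LINC01409", "LINC01128"])
atac_names = pd.Index(
    [
        "chr1:181000-181500",
        "chr1:191000-191500",
        "chr1:191500-192000",
        "chr1:817000-817500",
    ]
)

# Toy stand-in for imputed_expression: RNA-sized, but labelled with ATAC names.
toy_expression = pd.DataFrame([[0.1, 0.2, 0.3]], columns=atac_names[:3])

result = matching_modalities(
    toy_expression.columns, {"rna": rna_names, "atac": atac_names}
)
print(result)
```

With the real objects, the same call would take `imputed_expression.columns` and the two `var_names` indexes; a result containing only `"atac"` would confirm that the labels are taken from the wrong modality.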
Also, when I tried to get the normalized accessibility, I got the following error:
>>> imputed_accessibility = model.get_normalized_accessibility()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/andy/anaconda3/envs/marisol_zoo/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/andy/anaconda3/envs/marisol_zoo/lib/python3.12/site-packages/scvi/model/_multivi.py", line 601, in get_normalized_accessibility
    return pd.DataFrame(
           ^^^^^^^^^^^^^
  File "/home/andy/anaconda3/envs/marisol_zoo/lib/python3.12/site-packages/pandas/core/frame.py", line 831, in __init__
    mgr = ndarray_to_mgr(
          ^^^^^^^^^^^^^^^
  File "/home/andy/anaconda3/envs/marisol_zoo/lib/python3.12/site-packages/pandas/core/internals/construction.py", line 336, in ndarray_to_mgr
    _check_values_indices_shape_match(values, index, columns)
  File "/home/andy/anaconda3/envs/marisol_zoo/lib/python3.12/site-packages/pandas/core/internals/construction.py", line 420, in _check_values_indices_shape_match
    raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")
ValueError: Shape of passed values is (73236, 150000), indices imply (73236, 61817)
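As far as I can tell, the error itself is just pandas refusing a column index whose length differs from the array width. The sketch below reproduces that with toy sizes (4 cells, 3 genes, 5 regions standing in for 73236, 61817 and 150000) using only NumPy and pandas; all names in it are hypothetical. It also shows the relabelling I would try as a workaround, assuming the raw accessibility array can be obtained from the model without the DataFrame wrapper, which I have not verified for 1.4.0.post1.

```python
import numpy as np
import pandas as pd

# Toy sizes standing in for 73236 cells, 61817 genes and 150000 regions.
n_cells, n_genes, n_regions = 4, 3, 5

# Stand-in for the accessibility estimates: one column per ATAC region.
values = np.zeros((n_cells, n_regions))
gene_names = [f"gene{i}" for i in range(n_genes)]
region_names = [f"region{i}" for i in range(n_regions)]

# Labelling a region-sized array with gene names fails in the same way
# as the traceback above.
try:
    pd.DataFrame(values, columns=gene_names)
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)

# Labelling the same array with the region names works as expected.
relabelled = pd.DataFrame(values, columns=region_names)
print(relabelled.shape)
```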
Here is the model setup:
>>> model.view_anndata_setup()
Anndata setup with scvi-tools version 1.4.0.post1.
Setup via `MULTIVI.setup_anndata` with arguments:
{
│ 'rna_layer': None,
│ 'atac_layer': None,
│ 'protein_layer': None,
│ 'batch_key': None,
│ 'size_factor_key': None,
│ 'categorical_covariate_keys': None,
│ 'continuous_covariate_keys': None,
│ 'idx_layer': None,
│ 'modalities': {'rna_layer': 'rna', 'atac_layer': 'atac'}
}
Summary Statistics
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Summary Stat Key ┃ Value ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ n_atac │ 150000 │
│ n_batch │ 1 │
│ n_cells │ 73236 │
│ n_extra_categorical_covs │ 0 │
│ n_extra_continuous_covs │ 0 │
│ n_labels │ 1 │
│ n_size_factor │ 0 │
│ n_vars │ 61817 │
└──────────────────────────┴────────┘
Data Registry
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Registry Key ┃ scvi-tools Location ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ X │ adata.mod['rna'].X │
│ atac │ adata.mod['atac'].X │
│ batch │ adata.obs['_scvi_batch'] │
│ ind_x │ adata.obs['_indices'] │
│ labels │ adata.obs['_scvi_labels'] │
└──────────────┴───────────────────────────┘
batch State Registry
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ Source Location ┃ Categories ┃ scvi-tools Encoding ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩
│ adata.obs['_scvi_batch'] │ 0 │ 0 │
└──────────────────────────┴────────────┴─────────────────────┘
labels State Registry
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ Source Location ┃ Categories ┃ scvi-tools Encoding ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩
│ adata.obs['_scvi_labels'] │ 0 │ 0 │
└───────────────────────────┴────────────┴─────────────────────┘
And here are the var labels for the rna and atac modalities:
>>> md_mvi.mod['rna'].var
gene_ids feature_types
AL627309.5 AL627309.5 gene
LINC01409 LINC01409 gene
LINC01128 LINC01128 gene
LINC00115 LINC00115 gene
FAM41C FAM41C gene
... ... ...
AC008878.4 AC008878.4 gene
AC022098.2 AC022098.2 gene
AC008753.2 AC008753.2 gene
AP006621.2 AP006621.2 gene
Z97653.1 Z97653.1 gene
[61817 rows x 2 columns]
>>> md_mvi.mod['atac'].var
gene_ids feature_types
index
chr1:181000-181500 chr1:181000-181500 fragment
chr1:191000-191500 chr1:191000-191500 fragment
chr1:191500-192000 chr1:191500-192000 fragment
chr1:817000-817500 chr1:817000-817500 fragment
chr1:819000-819500 chr1:819000-819500 fragment
... ... ...
chrX:155264000-155264500 chrX:155264000-155264500 fragment
chrX:155612500-155613000 chrX:155612500-155613000 fragment
chrX:155767500-155768000 chrX:155767500-155768000 fragment
chrX:155820000-155820500 chrX:155820000-155820500 fragment
chrX:156030500-156031000 chrX:156030500-156031000 fragment
[150000 rows x 2 columns]
So it seems the model is somehow using the atac labels for rna, even though it was already set up through scvi.model.MULTIVI.setup_mudata?