How to reorder adata according to sorted indices?

Hi,

I would like to sort the indices of my adata, reorder the adata according to the sorted indices, and then save it as a scanpy object. I tried the following:

adata = sc.read_h5ad(filename='result.h5ad')
nindex = adata.obs.index
nindex.sort()
adata = adata[nindex,:]
adata.write_h5ad(filename='result.h5ad')

But I get the following error:

Traceback (most recent call last):
  File "/scratch/project_2009639/scripts/pan-autoimmune/scvi/combine_10x_4.py", line 100, in <module>
    adata.write_h5ad(filename='10x_all.h5ad')
  File "/PUHTI_TYKKY_Cq2gHLh/miniconda/envs/env1/lib/python3.9/site-packages/anndata/_core/anndata.py", line 2017, in write_h5ad
    write_h5ad(
  File "/PUHTI_TYKKY_Cq2gHLh/miniconda/envs/env1/lib/python3.9/site-packages/anndata/_io/h5ad.py", line 91, in write_h5ad
    write_elem(f, "X", adata.X, dataset_kwargs=dataset_kwargs)
  File "/PUHTI_TYKKY_Cq2gHLh/miniconda/envs/env1/lib/python3.9/site-packages/anndata/_core/anndata.py", line 683, in X
    _subset(self._adata_ref.X, (self._oidx, self._vidx)),
  File "/PUHTI_TYKKY_Cq2gHLh/miniconda/envs/env1/lib/python3.9/functools.py", line 888, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/PUHTI_TYKKY_Cq2gHLh/miniconda/envs/env1/lib/python3.9/site-packages/anndata/_core/index.py", line 168, in _subset_spmatrix
    return a[subset_idx]
  File "/PUHTI_TYKKY_Cq2gHLh/miniconda/envs/env1/lib/python3.9/site-packages/scipy/sparse/_index.py", line 77, in __getitem__
    return self._get_arrayXslice(row, col)
  File "/PUHTI_TYKKY_Cq2gHLh/miniconda/envs/env1/lib/python3.9/site-packages/scipy/sparse/_csr.py", line 216, in _get_arrayXslice
    return self._major_index_fancy(row)._get_submatrix(minor=col)
  File "/PUHTI_TYKKY_Cq2gHLh/miniconda/envs/env1/lib/python3.9/site-packages/scipy/sparse/_compressed.py", line 713, in _major_index_fancy
    csr_row_index(M, indices, self.indptr, self.indices, self.data,
ValueError: Output dtype not compatible with inputs.

Any thoughts?

I believe the call index.sort() should throw an error, since pandas Index objects are meant to be immutable.

E.g.:

In [1]: import pandas as pd

In [2]: pd.DataFrame({"a": [1,2,3]}).index.sort()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], line 1
----> 1 pd.DataFrame({"a": [1,2,3]}).index.sort()

File ~/miniforge3/envs/scanpy-dev/lib/python3.11/site-packages/pandas/core/indexes/base.py:5868, in Index.sort(self, *args, **kwargs)
   5863 @final
   5864 def sort(self, *args, **kwargs):
   5865     """
   5866     Use sort_values instead.
   5867     """
-> 5868     raise TypeError("cannot sort an Index object in-place, use sort_values instead")

TypeError: cannot sort an Index object in-place, use sort_values instead

So you may have a buggy version of pandas.

First, I would suggest making sure all your packages are up to date. Then:

adata[adata.obs_names.sort_values(), :]

That should do what you need.
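
To illustrate the difference between sorting in place and sort_values, here is a minimal sketch using only pandas and NumPy. The barcodes and the small matrix are made-up stand-ins for adata.obs_names and adata.X, not data from this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical cell barcodes standing in for adata.obs_names.
obs_names = pd.Index(["cell_c", "cell_a", "cell_b"])

# sort_values() returns a new, sorted Index; the original is left untouched.
sorted_names = obs_names.sort_values()

# Integer positions of the sorted labels within the original order.
order = obs_names.get_indexer(sorted_names)

# Toy matrix standing in for adata.X, one row per cell.
X = np.array([[3, 3], [1, 1], [2, 2]])

# Reorder the rows so they follow the sorted labels.
X_sorted = X[order]
```

With an AnnData object, the same integer positions could in principle be used as adata[order, :], which reorders the labels and the data together. Whether that sidesteps the error reported above has not been tested here.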

Hi @ivirshup,

Thanks. I tried adata.obs.sort_index(inplace=True), which worked. However, the resulting UMAP only has one color.

Any ideas?

Btw, adata[adata.obs_names.sort_values(), :] still raises "ValueError: Output dtype not compatible with inputs." when writing to file.

This is sounding more like a bug. Could you open an issue on the anndata repository for this?