Read_10x_mtx error UnicodeDecodeError:

cookiemonster · March 8, 2023, 8:43am

I was in the process of creating the AnnData but when I ran the code it gave me a UnicodeDecodeError: ‘utf-8’ codec can’t decode byte 0x8b in position 1: invalid start byte.

I'm not sure why the error occurs, since my file formats are correct (tsv.gz, mtx.gz). The data I’m using is the raw cell matrix download from 10x, from this link:
https://www.10xgenomics.com/resources/datasets/6-k-pbm-cs-from-a-healthy-donor-1-standard-1-1-0

import scanpy as sc

data_dir = '/Users/csb/mount/scRNA/'

# create AnnData
adata_pbmc6k = sc.read_10x_mtx(data_dir, var_names='gene_ids', cache=True)

adata_pbmc6k.uns["name"] = "PBMC 6k"

Error message:

Even when I add dtype=‘float64’, I still get the same error message.

yotamcons · March 16, 2023, 10:14am

It seems you are running into a pandas.read_csv error, so I suggest investigating in that direction.
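One way to narrow this down is to read one of the tab-separated files with pandas directly, outside of scanpy. This is only a rough sketch: the helper name `read_tsv_gz` is made up for illustration, and it assumes the file is a headerless, gzip-compressed TSV.

```python
import pandas as pd


def read_tsv_gz(path):
    """Read a headerless, gzip-compressed TSV straight into a DataFrame.

    Hypothetical helper for debugging; compression is set explicitly
    instead of being inferred from the file extension.
    """
    return pd.read_csv(path, sep="\t", header=None, compression="gzip")
```

Calling it on the barcodes or gene/feature file in your data_dir (the exact file names depend on your download) should either print a sensible table or raise the same UnicodeDecodeError, which would confirm the problem lies in the file rather than in scanpy.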

Alternatively, you can check if this repeats in other 10x’s cell/matrix raw datasets as there might be an actual problem with the file.
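A note on the byte in the message: 0x1f 0x8b are the first two bytes of every gzip file, so "can't decode byte 0x8b in position 1" suggests that gzip-compressed bytes reached the text decoder. Two possible causes (guesses, not confirmed for this dataset) are a file that was gzipped twice, or a compressed file whose name does not end in .gz. A minimal sketch to check the files themselves, with helper names `gzip_layers` and `report` invented for this example:

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def gzip_layers(path, max_layers=5):
    """Return how many times the file is gzip-wrapped (0 means plain).

    Reads the whole file into memory, which is fine for small datasets.
    """
    data = Path(path).read_bytes()
    layers = 0
    while data[:2] == GZIP_MAGIC and layers < max_layers:
        data = gzip.decompress(data)
        layers += 1
    return layers


def report(data_dir):
    """Map each file name in data_dir to its number of gzip layers."""
    return {
        p.name: gzip_layers(p)
        for p in sorted(Path(data_dir).iterdir())
        if p.is_file()
    }
```

Running report('/Users/csb/mount/scRNA/') should give 1 for every .gz file and 0 for every uncompressed one. A value of 2 would mean the file was compressed twice, and a nonzero value for a file without a .gz extension would mean the name and the contents disagree.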

And as always, try updating the software and see whether that solves the issue.
