CELLxGENE datasets: raw or scaled data?

Hi!

First, I would like to know how the .X matrices were made in the .h5ad files that can be downloaded from Datasets - CZ CELLxGENE Discover. What is the origin of those floats?

I finally found that the raw data is under adata._raw.X.

Second, is there any accessible AWS or gcloud storage where the raw counts are located? Something like a collection of .h5ad files, e.g. dataset_id.h5ad. I am referring to the datasets with the raw counts generated after the CELLxGENE filtering pipeline that removes low-count cells etc.

I am aware that the Python API can be used to download some of the .h5ad files; however, it is sometimes very slow or gets stuck.

Thank you in advance for your reply,

Best

Hi @LysSanzMoreta,

I’m afraid I don’t know the answers to your questions, but I wanted to point out that adata._raw is a private attribute, so it might change without notice. There is adata.raw, which currently just returns adata._raw but is meant to be part of the stable API. So I suggest you use adata.raw.X to retrieve the raw counts rather than adata._raw.X.
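
For illustration, here is a minimal sketch of how that access could be wrapped. The helper name get_raw_counts is made up for this example and is not part of anndata. It also assumes that when adata.raw is None (no raw layer stored), the counts live in adata.X itself; please check that assumption against the dataset you downloaded.

```python
def get_raw_counts(adata):
    """Return (matrix, source) for the raw counts of an AnnData-like object.

    Hypothetical helper, not part of anndata. It only relies on the
    public attributes .raw and .X.
    """
    # Public accessor; it is None when the object stores no raw layer.
    raw = getattr(adata, "raw", None)
    if raw is not None:
        return raw.X, "raw.X"
    # Assumption: without a raw layer, .X itself holds the counts.
    return adata.X, "X"
```

With a real file this would be used roughly as adata = anndata.read_h5ad("dataset_id.h5ad") followed by counts, source = get_raw_counts(adata), where the file name is only a placeholder.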

Thanks anyway!

Strange, .raw is not found in the keys; that is why I had to dig into the private attributes.