Hi single-cell connoisseurs,
I am curious whether the default dtype of the AnnData raw object is float or int.
For example:
raw_adata = adata.raw.to_adata()
raw_adata.X
Output:
<220752x29484 sparse matrix of type '<class 'numpy.float64'>'
with 307103988 stored elements in Compressed Sparse Row format>
Of course, I got this object from a published dataset and have to restore the raw counts. I am just curious whether raw counts are stored as float by default.
The motivation for this question is that I saw this cute way of checking whether the data is normalized:
if check_counts:
    # check if observations are unnormalized using first 10
    X_subset = adata.X[:10]
    norm_error = 'Make sure that the dataset (adata.X) contains unnormalized count data.'
    if sp.sparse.issparse(X_subset):
        assert (X_subset.astype(int) != X_subset).nnz == 0, norm_error
    else:
        assert np.all(X_subset.astype(int) == X_subset), norm_error
Presumably, my case would fail this check if the sparse matrix is not stored as int by default.
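To test that assumption, here is a minimal sketch of the same check run on toy data. The helper name `looks_like_counts` and the two small matrices are my own stand-ins, not part of AnnData or any other library, and the log1p step is only a placeholder for whatever normalization a real dataset went through. If I read the comparison correctly, it looks at values rather than dtype, so a float64 matrix holding only whole numbers should pass and only non-integer values should trip it.

```python
import numpy as np
import scipy.sparse as sparse


def looks_like_counts(X, n_obs=10):
    """Hypothetical helper: True if the first n_obs rows hold only whole numbers."""
    X_subset = X[:n_obs]
    if sparse.issparse(X_subset):
        # Entries that change when truncated to int are non-integer values
        return (X_subset.astype(int) != X_subset).nnz == 0
    return bool(np.all(X_subset.astype(int) == X_subset))


# Whole-number counts stored as float64, like the matrix in the output above
counts = sparse.csr_matrix(np.array([[0.0, 3.0, 1.0], [2.0, 0.0, 5.0]]))

# Toy stand-in for normalized data: log1p applied to the stored values
normalized = counts.copy()
normalized.data = np.log1p(normalized.data)

print(counts.dtype, looks_like_counts(counts))
print(normalized.dtype, looks_like_counts(normalized))
```

Both matrices have dtype float64 here, so any difference between the two results comes from the values alone.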
Thank you.
Wil