How can I export a list of genes and counts for each cluster from adata.raw?

Hello,

I have an AnnData object from a sample that I am processing with scanpy. I have run the Leiden clustering, and I would like to export to a CSV file all the genes expressed in each cluster, with the counts for each gene.

df_gene_expression = pd.DataFrame(adata.X, index=adata.obs.index, columns=adata.var.index)
df_gene_expression['cluster'] = adata.obs['leiden_0.4']

The above seems to work, but I also want to do this for the adata.raw data.
I first ran the line below and then ran the commands to create the dataframe again, this time on ad5.

ad5 = adata.raw.to_adata()

I get this message:

ValueError: Shape of passed values is (11647, 1), indices imply (11647, 18845)

I looked at the ad5.X and it looks like this:

<11647x18845 sparse matrix of type '<class 'numpy.float32'>'
with 16326262 stored elements in Compressed Sparse Row format>

How can I use it properly to export the list?

Thank you

That’s weird!

Please report this as a bug over on anndata:
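
In the meantime, here is a possible workaround. The shape mismatch in the error is consistent with ad5.X being a SciPy sparse matrix, which pd.DataFrame does not unpack into a cells x genes table, so converting it to a dense array first should avoid it. The sketch below uses a small toy matrix in place of ad5.X; the helper name counts_frame and the toy cell, gene, and cluster labels are made up for illustration and are not part of scanpy or anndata.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def counts_frame(X, obs_names, var_names, clusters):
    """Build a cells x genes DataFrame with a trailing 'cluster' column.

    Hypothetical helper for illustration only.
    """
    # Densify sparse input before handing it to pandas; dense input passes through.
    dense = X.toarray() if sparse.issparse(X) else np.asarray(X)
    df = pd.DataFrame(dense, index=list(obs_names), columns=list(var_names))
    # Assign labels positionally; they must be in the same order as the rows of X.
    df["cluster"] = np.asarray(clusters)
    return df


# Toy stand-in for ad5.X: 4 cells x 3 genes in CSR format.
X = sparse.csr_matrix(
    np.array([[1, 0, 2], [0, 3, 0], [4, 0, 0], [0, 0, 5]], dtype=np.float32)
)
cells = ["c1", "c2", "c3", "c4"]
genes = ["g1", "g2", "g3"]
clusters = ["0", "0", "1", "1"]

df = counts_frame(X, cells, genes, clusters)

# One row per cluster, one column per gene, values summed over the cells in the cluster.
per_cluster = df.groupby("cluster").sum()

# to_csv() with no path returns the CSV text; pass a filename to write a file instead.
csv_text = per_cluster.to_csv()
```

With the real object, the call would presumably look like counts_frame(ad5.X, ad5.obs_names, ad5.var_names, adata.obs['leiden_0.4']). Note that a dense float32 array of 11647 x 18845 values takes roughly 0.9 GB of memory; if that is too much, pd.DataFrame.sparse.from_spmatrix(ad5.X, index=..., columns=...) keeps the data sparse.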