Hi,
I used the following code to get biomart gene annotation for my mouse genes:
annot = sc.queries.biomart_annotations(
    "mmusculus",
    ["ensembl_gene_id", "entrez_gene_id", "start_position", "end_position", "chromosome_name"],
).set_index("ensembl_gene_id")
annot
but got the following error:
KeyError Traceback (most recent call last)
File ~/ENTER/lib/python3.9/site-packages/pybiomart/dataset.py:243, in Dataset.query(self, attributes, filters, only_unique, use_attr_names)
242 try:
--> 243 attr = self.attributes[name]
244 self._add_attr_node(dataset, attr)
KeyError: 'entrez_gene_id'
During handling of the above exception, another exception occurred:
BiomartException Traceback (most recent call last)
Input In [62], in <cell line: 1>()
----> 1 annot = sc.queries.biomart_annotations(
2 "mmusculus",
3 ["ensembl_gene_id", "entrez_gene_id", "start_position", "end_position", "chromosome_name"],
4 ).set_index("ensembl_gene_id")
5 annot
File ~/ENTER/lib/python3.9/site-packages/scanpy/queries/_queries.py:108, in biomart_annotations(org, attrs, host, use_cache)
74 @_doc_params(doc_org=_doc_org, doc_host=_doc_host, doc_use_cache=_doc_use_cache)
75 def biomart_annotations(
76 org: str,
(…)
80 use_cache: bool = False,
81 ) -> pd.DataFrame:
82 """
83 Retrieve gene annotations from ensembl biomart.
84
(…)
106 >>> adata.var[annot.columns] = annot
107 """
--> 108 return simple_query(org=org, attrs=attrs, host=host, use_cache=use_cache)
File ~/ENTER/lib/python3.9/site-packages/scanpy/queries/_queries.py:70, in simple_query(org, attrs, filters, host, use_cache)
66 server = Server(host, use_cache=use_cache)
67 dataset = server.marts["ENSEMBL_MART_ENSEMBL"].datasets[
68 "{}_gene_ensembl".format(org)
69 ]
---> 70 res = dataset.query(attributes=attrs, filters=filters, use_attr_names=True)
71 return res
File ~/ENTER/lib/python3.9/site-packages/pybiomart/dataset.py:246, in Dataset.query(self, attributes, filters, only_unique, use_attr_names)
244 self._add_attr_node(dataset, attr)
245 except KeyError:
--> 246 raise BiomartException(
247 'Unknown attribute {}, check dataset attributes '
248 'for a list of valid attributes.'.format(name))
250 if filters is not None:
251 # Add filter elements.
252 for name, value in filters.items():
BiomartException: Unknown attribute entrez_gene_id, check dataset attributes for a list of valid attributes.
I don't know how to check the dataset attributes to see whether the mmusculus dataset contains Entrez gene IDs. Could anyone help?
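For reference, here is a minimal sketch of how the attribute list might be inspected directly with pybiomart, which the traceback shows scanpy calling under the hood. The helper names `find_attributes` and `list_dataset_attributes` are hypothetical, and the host URL is an assumption; the mart and dataset names are copied from the traceback. I suspect the attribute is spelled differently in current Ensembl releases (possibly `entrezgene_id`), but that is a guess to verify against the returned list.

```python
def find_attributes(names, keyword):
    """Return the attribute names that contain `keyword` (case-insensitive), sorted."""
    keyword = keyword.lower()
    return sorted(n for n in names if keyword in n.lower())


def list_dataset_attributes(org="mmusculus", host="http://www.ensembl.org"):
    """Fetch all attribute names for `<org>_gene_ensembl` (needs network access).

    Hypothetical helper; the host default is an assumption.
    """
    # Imported lazily so the pure helper above works without pybiomart installed.
    from pybiomart import Server

    server = Server(host=host)
    dataset = server.marts["ENSEMBL_MART_ENSEMBL"].datasets[
        "{}_gene_ensembl".format(org)
    ]
    # `dataset.attributes` is the same dict-like object the traceback indexes by name.
    return list(dataset.attributes)


# Example usage (commented out because it queries the Ensembl server):
# names = list_dataset_attributes("mmusculus")
# print(find_attributes(names, "entrez"))
```

Whatever names the search prints are the ones that can be passed in the attribute list to `sc.queries.biomart_annotations`.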
This is for running CellO cell type annotation, which requires the input dataset to specify either HUGO gene symbols or Entrez gene IDs. So does scanpy have a function to convert gene IDs to HUGO gene symbols or Entrez gene IDs?
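In case it helps frame the question, this is roughly the conversion I have in mind, sketched in plain Python. It assumes the annotation table has already been retrieved with valid attribute names; `build_id_lookup` and `convert_ids` are hypothetical helpers, not scanpy functions, and the column names `entrezgene_id` and `external_gene_name` are assumptions that would need checking against the dataset's attribute list.

```python
def build_id_lookup(rows, source="ensembl_gene_id", target="entrezgene_id"):
    """Build a {source_id: target_id} dict from an iterable of row dicts.

    Rows with a missing target (None or NaN) are skipped; when one source ID
    maps to several targets, the first one seen is kept.
    """
    lookup = {}
    for row in rows:
        src, tgt = row.get(source), row.get(target)
        # NaN is the only value that is not equal to itself.
        if src is None or tgt is None or tgt != tgt:
            continue
        lookup.setdefault(src, tgt)
    return lookup


def convert_ids(gene_ids, lookup):
    """Translate each ID through `lookup`; unmapped IDs become None."""
    return [lookup.get(g) for g in gene_ids]


# Example usage (commented out because it queries the Ensembl server;
# the attribute names below are assumptions):
# annot = sc.queries.biomart_annotations(
#     "mmusculus", ["ensembl_gene_id", "entrezgene_id", "external_gene_name"]
# )
# lookup = build_id_lookup(annot.to_dict("records"))
# entrez_ids = convert_ids(list(adata.var_names), lookup)
```

Keeping only the first match is a simplification; genes that map to several Entrez IDs, or to none, would need to be handled explicitly before passing the result to CellO.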
Thank you!
Ting