[ensembl-dev] Many Ensembl-Ids lack annotation?

Colin Davenport colindaven at gmail.com
Tue Aug 16 15:55:47 BST 2011


Dear Ensembl users,

Firstly, congratulations: Ensembl is a nice resource, and we have nothing
like it in the bacterial world!

I have a question about some very highly expressed genes that lack
annotation in the current Ensembl database (v62).

I am using the edgeR Bioconductor package to analyse human RNA-seq data.
Some of the most important genes in the dataset have Ensembl IDs but no
annotation attached (see the examples below, e.g. ENSG00000257107).

Are these IDs too old, too new, or am I missing something here?

If I look up the gene on the Ensembl website (IDHistory_gene), I get:
"Ensembl gene ENSG00000257107 is no longer in the database and has not been
mapped to any newer identifiers"

In fact, the edgeR Ensembl database has about 52,000 entries, but the BioMart
export only gives me about 22,000 entries with annotation.
Surely at least the important, highly expressed genes must have been mapped
to other identifiers if they have been removed?
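
To make the comparison concrete, here is a minimal sketch of how the two ID
sets can be compared. It assumes the BioMart export is a tab-separated file
with one row per gene; the column names "gene_id" and "symbol" and the
function split_by_annotation are hypothetical and are not part of edgeR or
BioMart. The toy data reuses a few of the IDs listed at the end of this
message.

```python
import csv
import io


def split_by_annotation(count_ids, biomart_tsv):
    """Partition IDs into (annotated, unannotated) using a BioMart-style TSV export."""
    reader = csv.DictReader(biomart_tsv, delimiter="\t")
    annotated_ids = set()
    for row in reader:
        # Treat an ID as annotated only if it carries a non-empty gene symbol.
        if (row.get("symbol") or "").strip():
            annotated_ids.add(row["gene_id"])
    annotated = [g for g in count_ids if g in annotated_ids]
    unannotated = [g for g in count_ids if g not in annotated_ids]
    return annotated, unannotated


# Toy export; the column names are assumptions, not BioMart's own headers.
export = io.StringIO(
    "gene_id\tsymbol\n"
    "ENSG00000196565\tHBG2\n"
    "ENSG00000244734\tHBB\n"
)
# IDs as they might appear in the row names of the count table.
ids = ["ENSG00000196565", "ENSG00000257107", "ENSG00000244734", "ENSG00000255592"]

annotated, unannotated = split_by_annotation(ids, export)
print(len(annotated), "annotated;", len(unannotated), "without annotation")
print(unannotated)
```

The list of unannotated IDs is what would then need to be checked against
the ID history, one release at a time.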




Thanks for any help!
Regards,
Colin

ENSG00000196565  HBG2    11  hemoglobin, gamma G [Source:HGNC Symbol;Acc:4832]
ENSG00000188536  HBA2    16  hemoglobin, alpha 2 [Source:HGNC Symbol;Acc:4824]
ENSG00000244734  HBB     11  hemoglobin, beta [Source:HGNC Symbol;Acc:4827]
ENSG00000206172  HBA1    16  hemoglobin, alpha 1 [Source:HGNC Symbol;Acc:4823]
ENSG00000257107
ENSG00000255592
ENSG00000210082
ENSG00000211459
ENSG00000105372  RPS19   19  ribosomal protein S19 [Source:HGNC Symbol;Acc:10402]
ENSG00000198712  MT-CO2  MT  mitochondrially encoded cytochrome c oxidase II [Source:HGNC Symbol;Acc:7421]