[ensembl-dev] Ensembl IDs versus HGNC

mag mr6 at ebi.ac.uk
Thu Feb 20 16:18:05 GMT 2014


Hi Genomeo,

HGNC data is manually curated: HGNC curators check a locus and assign 
the corresponding Ensembl entry.
Because each entry is manually curated, not all Ensembl mappings are 
necessarily available.
It does mean, though, that HGNC can be updated continuously.

In Ensembl, we typically update those mappings every release, as the 
human gene set is updated every release.
We assign HGNC IDs using direct mappings from HGNC.
These are complemented by indirect mappings, via UniProt or RefSeq.
If a UniProt entry is mapped to an Ensembl entry and that same UniProt 
entry is mapped in HGNC to an HGNC symbol, the HGNC symbol is assigned 
to the Ensembl entry.
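
To make the two-step idea concrete, here is a minimal sketch in Python. It is only an illustration, not the actual Ensembl pipeline code: the function name, the dictionary layout and all identifiers are made up, and I am assuming that a direct mapping takes precedence over an indirect one.

```python
def assign_hgnc(direct, ensembl_to_uniprot, uniprot_to_hgnc):
    """Return {ensembl_id: set of HGNC symbols}.

    direct:             {ensembl_id: hgnc_symbol}, as supplied by HGNC
    ensembl_to_uniprot: {ensembl_id: [uniprot_accession, ...]}
    uniprot_to_hgnc:    {uniprot_accession: hgnc_symbol}, as recorded in HGNC
    """
    # Start from the direct mappings.
    assigned = {gene: {symbol} for gene, symbol in direct.items()}

    # Complement them with indirect mappings via UniProt.
    for gene, accessions in ensembl_to_uniprot.items():
        if gene in assigned:
            continue  # assumption: a direct mapping wins over an indirect one
        symbols = {uniprot_to_hgnc[acc] for acc in accessions
                   if acc in uniprot_to_hgnc}
        if symbols:
            assigned[gene] = symbols
    return assigned


# Made-up identifiers, for illustration only.
direct = {"ENSG_A": "GENE1"}
ensembl_to_uniprot = {"ENSG_A": ["UP1"], "ENSG_B": ["UP2"], "ENSG_C": ["UP3"]}
uniprot_to_hgnc = {"UP1": "GENE1", "UP2": "GENE2"}

result = assign_hgnc(direct, ensembl_to_uniprot, uniprot_to_hgnc)
print(result)
```

In this toy data ENSG_B has no direct mapping but still receives GENE2 through UP2, while ENSG_C stays unmapped because UP3 has no HGNC symbol. The same pattern would apply to RefSeq.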

So there are more HGNC-Ensembl ID links in Ensembl than there are in HGNC.

What can also happen is that an Ensembl stable ID changes between 
releases because of major changes in the underlying sequence.
In those cases, we cannot get the direct mapping from HGNC, because 
the HGNC record still refers to the old stable ID.
We might still be able to keep the same name for the gene thanks to the 
two-step mappings via RefSeq or UniProt.
We then feed those cases back to HGNC for them to update their records 
if they agree with the replacement.
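
As a rough sketch of how such cases could be picked out (again hypothetical: the function, the old-to-new ID table and the identifiers are invented for illustration and do not describe our actual procedure):

```python
def feedback_candidates(old_direct, id_history, new_indirect):
    """List (old_id, new_id, symbol) tuples worth reporting to HGNC.

    old_direct:   {old_ensembl_id: hgnc_symbol}, direct mappings held by HGNC
    id_history:   {old_ensembl_id: new_ensembl_id}, stable IDs that changed
    new_indirect: {new_ensembl_id: set of hgnc_symbols} found via RefSeq/UniProt
    """
    candidates = []
    for old_id, new_id in id_history.items():
        symbol = old_direct.get(old_id)
        # Report only if the retired ID had a direct mapping and the
        # new ID kept the same symbol through an indirect mapping.
        if symbol is not None and symbol in new_indirect.get(new_id, set()):
            candidates.append((old_id, new_id, symbol))
    return candidates


# Made-up identifiers, for illustration only.
old_direct = {"ENSG_OLD1": "GENE1", "ENSG_OLD2": "GENE2"}
id_history = {"ENSG_OLD1": "ENSG_NEW1", "ENSG_OLD2": "ENSG_NEW2"}
new_indirect = {"ENSG_NEW1": {"GENE1"}}

candidates = feedback_candidates(old_direct, id_history, new_indirect)
print(candidates)
```

Here only the first gene is reported: its new stable ID still carries GENE1 indirectly, whereas the second one lost its name altogether.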

I am not sure how NCBI assigns mappings to Ensembl. They could be 
importing the mappings from us directly or generating their own.

I hope this answers most of your questions.


Regards,
Magali

On 20/02/2014 11:43, Genomeo Dev wrote:
> Hi,
>
> I have a set of ~ 6000 Ensembl IDs which I want to map to HGNC IDs. I 
> am faced with the following situation:
>
> Based on Ensembl BioMart or Ensembl REST, there are ~ 4000 of these 
> that have HGNC IDs.
>
> Based on HGNC BioMart, there are ~ 3000 which have HGNC IDs. HGNC DB 
> states that it uses the mappings supplied by Ensembl.
>
> The IDs mapped from each of these sources are not always the same.
>
> Questions:
>
> - What is causing the different level of coverage?
> - What is causing the differences in specific mappings if all of it is 
> done by Ensembl?
> - How often does this mapping change at any of these sources?
> - How do other sources like NCBI assign Ensembl IDs to their Entrez IDs?
> - What is the best way of getting HGNC IDs for Ensembl IDs: from 
> Ensembl or from HGNC DB?
>
> Thanks!
>
> -- 
> G.