[ensembl-dev] Question about Supporting evidence / Core API

Healy, Matthew Matthew.Healy at bms.com
Tue Jun 23 13:00:33 BST 2015


The simplest way I know of to obtain these mappings is to download one of the ID mapping files provided on the Uniprot web site and documented here:

ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/README


In particular, the idmapping_selected.tab file for your species of interest gives mappings to Ensembl Gene, Transcript, and Protein identifiers.
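As a rough illustration of how that file could be used, here is a minimal Python sketch that reads an idmapping_selected.tab file and collects the Ensembl transcript identifiers for each UniProt accession. The column positions are an assumption based on my reading of the README (UniProtKB-AC in column 1, Ensembl_TRS in column 20, multiple values separated by "; "), so please check them against the README for the release you download. The function name load_transcript_map is made up for this example.

```python
import csv
import io

# Assumed 0-based column positions; verify against the README of your release.
ACCESSION_COL = 0      # UniProtKB-AC
ENSEMBL_TRS_COL = 19   # Ensembl_TRS (transcript identifiers)


def load_transcript_map(handle):
    """Return {UniProt accession: [Ensembl transcript IDs]} from an open text handle."""
    mapping = {}
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip short or malformed lines rather than failing on them.
        if len(row) <= ENSEMBL_TRS_COL:
            continue
        # A single cell may hold several identifiers separated by ';'.
        transcripts = [t.strip() for t in row[ENSEMBL_TRS_COL].split(";") if t.strip()]
        if transcripts:
            mapping.setdefault(row[ACCESSION_COL], []).extend(transcripts)
    return mapping


if __name__ == "__main__":
    # Synthetic row with placeholder identifiers, only to show the call shape.
    fields = [""] * 22
    fields[ACCESSION_COL] = "X00001"
    fields[ENSEMBL_TRS_COL] = "ENSTXXXXXXXXXX1; ENSTXXXXXXXXXX2"
    print(load_transcript_map(io.StringIO("\t".join(fields) + "\n")))
```

For a real download, the handle could come from open(path, newline="") or, if the file is still compressed, from gzip.open(path, "rt"). Accessions with an empty transcript column are left out of the result, which makes it easy to list what remains unmapped.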


Uniprot and Swissprot identifiers are among the available choices for External ID in Biomart on the Ensembl side, but I have found the Uniprot mappings file very convenient because it gives a large number of other identifiers that I frequently encounter.


Another extremely useful download, if you are working with human genes, is at HGNC:

http://www.genenames.org/help/download

This gives Ensembl and Uniprot identifiers, as does the Uniprot download.  The HGNC download also has some extremely useful content that is not readily available elsewhere, such as previously-approved HGNC symbols and synonyms.  I have found the synonyms and previous symbols fields from HGNC of great value for mapping lists of identifiers from sources that failed to use current approved symbols with consistency.  However, I also find that mappings based on synonyms require human checking before they can be trusted, because more than one gene may have the same synonym.
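The caution about synonyms can be made concrete with a small Python sketch. It indexes approved symbols, previous symbols and synonyms together, then reports every approved symbol a given name could refer to, so that names with more than one candidate can be set aside for manual checking. The record keys (symbol, prev_symbols, alias_symbols) and the function names are hypothetical and would need to be adapted to the columns of whichever HGNC download you use. Matching is made case-insensitive here as a design choice, not because HGNC requires it.

```python
from collections import defaultdict


def build_symbol_index(records):
    """Map each upper-cased name to the set of approved symbols it may refer to.

    `records` is an iterable of dicts with hypothetical keys:
    'symbol' (approved symbol), 'prev_symbols' and 'alias_symbols' (lists).
    """
    index = defaultdict(set)
    for rec in records:
        approved = rec["symbol"]
        names = [approved] + rec.get("prev_symbols", []) + rec.get("alias_symbols", [])
        for name in names:
            index[name.upper()].add(approved)
    return index


def candidates(name, index):
    """Return the sorted approved symbols that `name` could map to.

    An empty list means no match; more than one entry means the name is
    ambiguous and should be reviewed by a person.
    """
    return sorted(index.get(name.upper(), ()))


if __name__ == "__main__":
    # Placeholder gene symbols, not real HGNC entries.
    demo = [
        {"symbol": "GENE1", "prev_symbols": ["OLD1"], "alias_symbols": ["ABC"]},
        {"symbol": "GENE2", "prev_symbols": [], "alias_symbols": ["ABC"]},
    ]
    idx = build_symbol_index(demo)
    for query in ("OLD1", "abc", "nothing"):
        print(query, candidates(query, idx))
```

In the demo data the synonym ABC is shared by two placeholder genes, so it comes back with two candidates, while the previous symbol OLD1 resolves to a single approved symbol.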


________________________________
From: dev-bounces at ensembl.org <dev-bounces at ensembl.org> on behalf of Marc P. Hoeppner <mphoeppner at gmail.com>
Sent: Tuesday, June 23, 2015 6:42 AM
To: dev at ensembl.org
Subject: [ensembl-dev] Question about Supporting evidence / Core API

Dear EnsEMBL team,

Using EnsEMBL 79, I am trying to generate a best-guess mapping between Uniprot accession numbers and EnsEMBL transcripts. First, I tried:

fetch_all_by_external_name ()

But I noticed that this was leaving a lot of Uniprot accessions unmapped. I then checked on the website and found that these Uniprot accessions tend to come up as "Supporting evidence". However, trying

my $transcripts = $transcript_adaptor->fetch_all_by_transcript_supporting_evidence($accession,"protein_align_feature");

didn't seem to work. I am probably missing something about how this data is connected internally - are there feature sources other than "protein_align_feature" and "dna_align_feature", or does this connect via another adapter?

An example where this is not working for me would be Uniprot entry A0M8Q6

Kind regards,

Marc