[ensembl-dev] Hyphenated entrez xrefs?

Alexander Pico apico at gladstone.ucsf.edu
Fri Dec 13 21:46:20 GMT 2013


Dear Ensembl,

I've run across a number of hyphenated Entrez Gene identifiers in the xref tables, starting back in release 72. For example:

rattus_norvegicus_core_72_5

+---------+----------------+---------------+---------------+---------+-----------------------------------------------+-----------+---------------+
| xref_id | external_db_id | dbprimary_acc | display_label | version | description                                   | info_type | info_text     |
+---------+----------------+---------------+---------------+---------+-----------------------------------------------+-----------+---------------+
|  576085 |           1300 | 288264        | Ifnar1        | 0       | interferon (alpha, beta and omega) receptor 1 | DEPENDENT |               |
| 1143738 |           1300 | 288264-201    | Ifnar1-201    | 0       | interferon (alpha, beta and omega) receptor 1 | MISC      | via gene name |
+---------+----------------+---------------+---------------+---------+-----------------------------------------------+-----------+---------------+

The first result is accurate, but the second one is apparently manufactured. Entries like this break a number of downstream uses for xrefs, since the "-201" suffix is not part of the official Entrez Gene ID format.

What are these? Are you planning on keeping these around in future xref tables?

And how would you recommend avoiding these in xref queries using the Perl API? Here's my current Perl pseudocode:

my $db_entries = $gene->get_all_DBLinks();
foreach my $dbe (@$db_entries) {
	# Keep only xrefs whose external database is EntrezGene
	if ($dbe->dbname() eq 'EntrezGene') {
		# Collect xref associated with $gene
	}
}
 
What other filters or checks should I do to exclude the manufactured identifiers associated with your Entrez Gene records?
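
For what it's worth, the only workaround I can think of on my side is to reject any accession that isn't purely numeric, since official Entrez Gene IDs are integers. Below is a minimal, self-contained sketch of that idea. The helper name is_plain_entrez_id is hypothetical, and the hash records are stand-ins for the two rows above; I'm assuming that with the real API the same values would come from $dbe->dbname() and $dbe->primary_id().

```perl
use strict;
use warnings;

# Hypothetical helper: true when an accession is digits only (the form of an
# official Entrez Gene ID), false for hyphenated forms such as "288264-201".
sub is_plain_entrez_id {
    my ($acc) = @_;
    return (defined $acc && $acc =~ /\A[0-9]+\z/) ? 1 : 0;
}

# Stand-in records mirroring the two xref rows above (not real API objects).
my @xrefs = (
    { dbname => 'EntrezGene', primary_id => '288264',     info_type => 'DEPENDENT' },
    { dbname => 'EntrezGene', primary_id => '288264-201', info_type => 'MISC' },
);

# Keep EntrezGene xrefs whose accession passes the digits-only check.
my @kept = grep {
    $_->{dbname} eq 'EntrezGene' && is_plain_entrez_id($_->{primary_id})
} @xrefs;

print join(",", map { $_->{primary_id} } @kept), "\n";
```

This would work around the problem for Entrez Gene, but it wouldn't generalize to other external databases whose official IDs may legitimately contain hyphens, so I'd prefer a check you recommend.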

Thanks!
- Alex

----------------------------------------
Alexander Pico, PhD
NRNB Executive Director
Bioinformatics Assoc. Director
Gladstone Institutes
http://nrnb.org
http://gladstoneinstitutes.org
----------------------------------------
