[ensembl-dev] xref mapping

Genomeo Dev genomeodev at gmail.com
Wed Mar 26 22:45:47 GMT 2014


Hi Magali,

Thanks for the response.

Is there a rule for how the display name is assigned to a given Ensembl
gene ID? Something like: use the HGNC symbol if one exists, otherwise UniProt,
otherwise RFAM, otherwise miRBase, otherwise Havana?

The other question is: the display name of ENSG00000243485 is the HGNC
symbol MIR1302-10. How was this one symbol chosen from the set of four
mapped HGNC symbols retrievable with the xref command?
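
One way to confirm which of the four xrefs the display name corresponds to
is to compare the accession embedded in the lookup description (Acc:38233
in the output quoted below) with the primary ID of each xref. A minimal
Python sketch, using records copied from the output in this thread. The
helper name match_display_xref is made up for illustration, and the sketch
only shows which xref was picked, not why it was picked:

```python
import re

# Xref records for ENSG00000243485, copied from the xrefs output in this thread.
XREFS = [
    {"display_id": "MIR1302-11", "primary_id": "38246", "dbname": "HGNC"},
    {"display_id": "MIR1302-10", "primary_id": "38233", "dbname": "HGNC"},
    {"display_id": "MIR1302-9", "primary_id": "38218", "dbname": "HGNC"},
    {"display_id": "MIR1302-2", "primary_id": "35294", "dbname": "HGNC"},
]

# Description string as shown in the lookup output.
DESCRIPTION = "microRNA 1302-10 [Source:HGNC Symbol;Acc:38233]"


def match_display_xref(description, xrefs):
    """Return the xref whose primary_id equals the Acc: value in the description.

    Hypothetical helper for illustration; not part of any Ensembl API.
    """
    found = re.search(r"Acc:([^\]\s]+)", description)
    if found is None:
        return None
    accession = found.group(1)
    for xref in xrefs:
        if xref["primary_id"] == accession:
            return xref
    return None


if __name__ == "__main__":
    print(match_display_xref(DESCRIPTION, XREFS))
```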

G.


On 25 March 2014 21:16, <mr6 at ebi.ac.uk> wrote:

> Hi Genomeo,
>
> The xref endpoint returns the whole list of external references associated
> with an Ensembl object.
> This can be filtered for a given external source name, in this case HGNC.
>
> The lookup endpoint returns some information on the input object,
> including its location and display name, but excluding any external
> references.
>
> For this gene, the display name is an HGNC symbol.
> This is the case for most of our genes, but it can also be
> - Uniprot names (http://beta.rest.ensembl.org/lookup/id/ENSG00000261163)
> - RFAM (http://beta.rest.ensembl.org/lookup/id/ENSG00000252365)
> - miRBase (http://beta.rest.ensembl.org/lookup/id/ENSG00000265031)
> - Havana (http://beta.rest.ensembl.org/lookup/id/ENSG00000228741)
>
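
The two calls described above can be sketched in Python with only the
standard library. This is an illustration under assumptions: the beta
server URL is the one used in this thread and may have moved, the function
names are made up, and the "display_name" key in the lookup response is an
assumed field name (the thread only shows flattened lookup output).

```python
import json
import urllib.parse
import urllib.request

SERVER = "http://beta.rest.ensembl.org"  # server used in this thread


def xrefs_url(stable_id, external_db=None):
    """Build the xrefs/id URL, optionally filtered by external source name."""
    url = f"{SERVER}/xrefs/id/{stable_id}"
    if external_db:
        url += "?" + urllib.parse.urlencode({"external_db": external_db})
    return url


def lookup_url(stable_id):
    """Build the lookup/id URL for a stable ID."""
    return f"{SERVER}/lookup/id/{stable_id}"


def fetch_json(url):
    """GET a URL, asking for JSON, and return the decoded body."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def compare_endpoints(stable_id, external_db="HGNC"):
    """Return (xref display IDs, lookup display name) for one stable ID.

    Needs network access. "display_name" is an assumed key in the
    lookup response.
    """
    xrefs = fetch_json(xrefs_url(stable_id, external_db))
    lookup = fetch_json(lookup_url(stable_id))
    return [x["display_id"] for x in xrefs], lookup.get("display_name")
```

Calling compare_endpoints("ENSG00000243485") would be expected to return
the filtered xref symbols next to the single display name, which makes the
difference between the two endpoints easy to see side by side.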
>
> Hope this helps,
> Magali
>
> > Hi,
> >
> > I was comparing the output of the lookup/id and xrefs/id commands from the
> > Ensembl REST endpoint for ENSG00000243485:
> >
> > wget -q --header='Content-type:application/json' 'http://beta.rest.ensembl.org/xrefs/id/ENSG00000243485?external_db=HGNC' -O -
> > ENSG00000243485 HGNC MIR1302-11 microRNA 1302-11 HGNC Symbol Generated via refseq_manual DEPENDENT 38246 hsa-mir-1302-11 0
> > ENSG00000243485 HGNC MIR1302-10 microRNA 1302-10 HGNC Symbol Generated via refseq_manual DEPENDENT 38233 hsa-mir-1302-10 0
> > ENSG00000243485 HGNC MIR1302-9 microRNA 1302-9 HGNC Symbol Generated via refseq_manual DEPENDENT 38218 hsa-mir-1302-9 0
> > ENSG00000243485 HGNC MIR1302-2 microRNA 1302-2 HGNC Symbol Generated via refseq_manual DEPENDENT 35294 hsa-mir-1302-2, MIRN1302-2 0
> >
> > wget -q --header='Content-type:application/json' 'http://beta.rest.ensembl.org/lookup/id/ENSG00000243485?expand=1' -O -
> > ENSG00000243485 1 29554 31109 1 MIR1302-10 ensembl_havana ensembl_havana_lincrna microRNA 1302-10 [Source:HGNC Symbol;Acc:38233] lincRNA
> >
> > Why does the lookup command show only one of the four mapped HGNC
> > symbols?
> >
> > Thanks,
> >
> > G.
> >
> >
> > On 27 February 2014 11:20, Genomeo Dev <genomeodev at gmail.com> wrote:
> >
> >> Hi,
> >>
> >> I am interested in getting a wide set of cross references for Ensembl
> >> gene IDs. I found two programmatic ways to do that, which give consistent
> >> results but different amounts of detail. Using ENSG00000223972 as an example:
> >> (1)
> >> Using this REST API endpoint Python code (
> >> http://beta.rest.ensembl.org/documentation/info/xref_id)
> >>
> >>
> >>    import json
> >>    import sys
> >>
> >>    import httplib2
> >>
> >>    # Cache responses on disk in the ".cache" directory
> >>    http = httplib2.Http(".cache")
> >>
> >>    server = "http://beta.rest.ensembl.org"
> >>    ext = "/xrefs/id/ENSG00000223972?"
> >>    resp, content = http.request(server + ext, method="GET",
> >>                                 headers={"Content-Type": "application/json"})
> >>
> >>    # Stop if the server did not return HTTP 200
> >>    if resp.status != 200:
> >>        print("Invalid response: ", resp.status)
> >>        sys.exit()
> >>
> >>    decoded = json.loads(content)
> >>    print(repr(decoded))
> >>
> >>
> >> I get:
> >>
> >>
> >> {"display_id":"OTTHUMG00000000961","primary_id":"OTTHUMG00000000961","version":"2","description":null,"dbname":"OTTG","synonyms":[],"info_type":"NONE","info_text":"","db_display_name":"Havana gene"}
> >> {"primary_id":"Hs.714157","dbname":"UniGene","ensembl_identity":98,"synonyms":[],"ensembl_start":6,"xref_start":1,"xref_end":1639,"db_display_name":"UniGene","display_id":"Hs.714157","ensembl_end":1657,"version":"0","score":8055,"cigar_line":"1200M1D299M12D140M","description":"DEAD/H (Asp-Glu-Ala-Asp/His) box helicase 11 like 1","xref_identity":97,"evalue":null,"info_text":"","info_type":"SEQUENCE_MATCH"}
> >> {"primary_id":"Hs.618434","dbname":"UniGene","ensembl_identity":58,"synonyms":[],"ensembl_start":669,"xref_start":1,"xref_end":974,"db_display_name":"UniGene","display_id":"Hs.618434","ensembl_end":1655,"version":"0","score":4757,"cigar_line":"537M1D299M12D138M","description":"Similar to DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 11 isoform 1, mRNA (cDNA clone IMAGE:6103207)","xref_identity":96,"evalue":null,"info_text":"","info_type":"SEQUENCE_MATCH"}
> >> {"display_id":"DDX11L1","primary_id":"37102","version":"0","description":"DEAD/H (Asp-Glu-Ala-Asp/His) box helicase 11 like 1","dbname":"HGNC","synonyms":[],"info_type":"DIRECT","info_text":"Generated via ensembl_manual","db_display_name":"HGNC Symbol"}
> >> {"display_id":"DDX11L5","primary_id":"100287596","version":"0","description":"DEAD/H (Asp-Glu-Ala-Asp/His) box helicase 11 like 5","dbname":"EntrezGene","synonyms":[],"info_type":"DEPENDENT","info_text":"","db_display_name":"EntrezGene"}
> >> {"display_id":"DDX11L1","primary_id":"100287102","version":"0","description":"DEAD/H (Asp-Glu-Ala-Asp/His) box helicase 11 like 1","dbname":"EntrezGene","synonyms":[],"info_type":"DEPENDENT","info_text":"","db_display_name":"EntrezGene"}
> >> {"display_id":"ENSG00000223972","primary_id":"ENSG00000223972","version":"0","description":"","dbname":"ArrayExpress","synonyms":[],"info_type":"DIRECT","info_text":"","db_display_name":"ArrayExpress"}
> >> {"display_id":"DDX11L5","primary_id":"100287596","version":"0","description":"DEAD/H (Asp-Glu-Ala-Asp/His) box helicase 11 like 5","dbname":"WikiGene","synonyms":[],"info_type":"DEPENDENT","info_text":"","db_display_name":"WikiGene"}
> >> {"display_id":"DDX11L1","primary_id":"100287102","version":"0","description":"DEAD/H (Asp-Glu-Ala-Asp/His) box helicase 11 like 1","dbname":"WikiGene","synonyms":[],"info_type":"DEPENDENT","info_text":"","db_display_name":"WikiGene"}]
> >>
> >> (2)
> >>
> >> Using this Perl API code (based on
> >> http://www.ensembl.org/info/docs/api/core/core_tutorial.html):
> >>
> >> # Define a helper subroutine to print DBEntries
> >> sub print_DBEntries
> >> {
> >>     my $db_entries = shift;
> >>
> >>     foreach my $dbe ( @{$db_entries} ) {
> >>         printf "\tXREF %s (%s)\n", $dbe->display_id(), $dbe->dbname();
> >>     }
> >> }
> >>
> >> my $genes = $gene_adaptor->fetch_all_by_stable_id_list([@gene_list]);
> >>
> >>
> >> ...
> >>
> >>
> >> print "GENE ", $gene->stable_id(), "\n";
> >> print_DBEntries( $gene->get_all_DBEntries() );
> >>
> >> I get:
> >> XREF OTTHUMG00000000961 (OTTG)
> >> XREF ENSG00000223972 (ArrayExpress)
> >> XREF DDX11L1 (EntrezGene)
> >> XREF DDX11L5 (EntrezGene)
> >> XREF DDX11L1 (HGNC)
> >> XREF Hs.618434 (UniGene)
> >> XREF Hs.714157 (UniGene)
> >> XREF DDX11L1 (WikiGene)
> >> XREF DDX11L5 (WikiGene)
> >>
> >>
> >> Questions:
> >>
> >> 1. Am I correct in saying that the REST code uses the latest Ensembl
> >> release, while the API code uses the Ensembl release currently installed
> >> as part of the VM (I am using release 74)?
> >>
> >> 2. The REST code gives more extensive details (which I like) than the
> >> Perl API code. Could you suggest a simple way to use the API to get the
> >> same details?
> >>
> >> 3. The REST code output format: is tab-separated text supported?
> >>
> >> 4. Is there a file in the Ensembl FTP area which contains pre-generated
> >> detailed cross-reference mappings for all current Ensembl genes?
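
Regarding question 3: whatever the server itself supports, the JSON xref
records can be flattened to tab-separated text on the client side. A
minimal Python sketch, assuming the decoded response is a list of
dictionaries with the keys seen in the JSON output above; the function
name xrefs_to_tsv and the column selection are illustrative choices.

```python
import csv
import io

# Columns to export; all of these keys appear in the JSON output in this thread.
COLUMNS = ["dbname", "display_id", "primary_id", "info_type", "description"]


def xrefs_to_tsv(xrefs, columns=COLUMNS):
    """Render a list of xref dicts as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for xref in xrefs:
        # Missing keys and null values become empty cells.
        row = ["" if xref.get(col) is None else xref.get(col) for col in columns]
        writer.writerow(row)
    return buffer.getvalue()


# Two records trimmed from the ENSG00000223972 output in this thread.
SAMPLE = [
    {"display_id": "OTTHUMG00000000961", "primary_id": "OTTHUMG00000000961",
     "description": None, "dbname": "OTTG", "info_type": "NONE"},
    {"display_id": "DDX11L1", "primary_id": "37102",
     "description": "DEAD/H (Asp-Glu-Ala-Asp/His) box helicase 11 like 1",
     "dbname": "HGNC", "info_type": "DIRECT"},
]

if __name__ == "__main__":
    print(xrefs_to_tsv(SAMPLE), end="")
```

Alignment-specific keys such as cigar_line only occur on some records, so
they are left out of the default column list here; they can be added to
COLUMNS and will simply be empty where absent.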
> >> --
> >>
> >> Thanks,
> >>
> >> G.
> >>
> >
> >
> >
> > --
> > G.
> > _______________________________________________
> > Dev mailing list    Dev at ensembl.org
> > Posting guidelines and subscribe/unsubscribe info:
> > http://lists.ensembl.org/mailman/listinfo/dev
> > Ensembl Blog: http://www.ensembl.info/
> >
>
>



-- 
G.