[ensembl-dev] REST API: Feature ID returns multiple results

Michael Heuer heuermh at gmail.com
Mon May 19 18:48:21 BST 2014


All,

For users wishing to interact with the Ensembl REST APIs via Java,
I've written a partial client here

https://github.com/heuermh/ensembl-rest-client

It doesn't cover all the methods yet, but I could extend the pattern if desired.

I have held off on implementing wait-and-retry in the client because
it was mentioned that the throttling may be lifted once the service
moves out of beta.
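
For anyone who needs it before then, a wait-and-retry wrapper could look
roughly like the sketch below. It is written in Python for brevity and is
not code from the client linked above. The Throttled exception and the
call_with_retry function are hypothetical names, and how a throttled
response is actually signalled by the service is an assumption here, so
fetch() stands in for whatever request code raises it.

```python
import time


class Throttled(Exception):
    """Hypothetical signal that a request was refused because of rate limiting."""

    def __init__(self, retry_after=None):
        super().__init__("request throttled")
        self.retry_after = retry_after  # wait hint in seconds, if one is available


def call_with_retry(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on Throttled, wait and try again, up to max_attempts calls."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Throttled as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the failure
            if err.retry_after is not None:
                delay = err.retry_after  # prefer the hint when there is one
            else:
                delay = base_delay * 2 ** attempt  # otherwise back off exponentially
            sleep(delay)
```

The sleep function is a parameter only so that the waiting can be replaced
in tests; in normal use the default time.sleep applies.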

   michael


On Mon, May 19, 2014 at 12:35 PM, Cook, Malcolm <MEC at stowers.org> wrote:
> Careful there: you might not get the response your application needs
> if you send “too many” queries, because of throttling at, if memory serves,
> three queries per second.
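
One way to stay under a limit like this is to space requests out on the
client side. A minimal sketch in Python, assuming the figure of three
queries per second recalled above is right. MinIntervalLimiter and its
parameters are made-up names for illustration and are not part of any
Ensembl client.

```python
import time


class MinIntervalLimiter:
    """Hypothetical helper: keep calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second=3, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed to start."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Calling limiter.wait() immediately before each request keeps consecutive
requests at least a third of a second apart with the default setting.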
>
> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On Behalf Of
> Saren Tasciyan
> Sent: Monday, May 19, 2014 8:02 AM
> To: dev at ensembl.org
> Subject: Re: [ensembl-dev] REST API: Feature ID returns multiple results
>
>
>
> This allows me to filter out unwanted results: I can ignore entries with
> different IDs.
> This approach seems to work perfectly. Also, my understanding of how
> feature=... is used was a bit wrong.
>
> Thanks a lot Magali!
>
> Saren
>
> On 19.05.2014 12:29, mag wrote:
>
> Hi Saren,
>
> The feature lookup returns all features of the given object type that
> overlap your queried region.
>
> In your example, the region is that of the input transcript, and two genes
> overlap this region.
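
The overlap can be checked by hand with the coordinates from the XML
responses quoted in this thread. A small Python sketch; the overlaps
function is only an illustration of a plain interval test that ignores
strand, not the service's actual implementation.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if two closed intervals on the same sequence region share a base."""
    return start_a <= end_b and start_b <= end_a


# Coordinates copied from the responses quoted in this thread (seq_region 10).
transcript = (79874476, 79883041)  # ENSMUST00000163188
genes = {
    "ENSMUSG00000035835": (79860475, 79874634),
    "ENSMUSG00000057729": (79874476, 79883174),
}

# Collect every gene whose interval overlaps the transcript's interval.
hits = [
    gene_id
    for gene_id, (start, end) in genes.items()
    if overlaps(transcript[0], transcript[1], start, end)
]
print(hits)
```

Both genes end up in hits: the first gene ends at 79874634, which is past
the transcript start at 79874476, so the two intervals share a short
stretch even though the genes are on opposite strands.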
>
> To retrieve the gene linked to a transcript, I would recommend using
> 'feature=transcript' instead.
> Among the fields returned is a 'Parent' field that specifies which
> gene the transcript belongs to.
>
> http://beta.rest.ensembl.org/feature/id/ENSMUST00000163188?feature=transcript;content-type=text/xml
> <data ID="ENSMUST00000163188" Parent="ENSMUSG00000057729"
> biotype="processed_transcript" description="" end="79883041"
> external_name="Prtn3-004" feature_type="transcript"
> logic_name="havana" seq_region_name="10" source="ensembl_havana"
> start="79874476" strand="1"/>
>
> This will still return more transcripts than your input, but you will be
> able to uniquely associate each transcript with its gene.
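
Picking the gene out of such a response comes down to matching on ID and
reading Parent. A rough Python sketch using the standard library XML
parser. The SAMPLE text is an assumption: it reuses the attributes shown
above, wraps them in an <opt> element as in the feature=gene response
quoted further down, and adds a second, entirely made-up entry to stand in
for an extra overlapping transcript. The parent_gene function is a
hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# First element: attributes taken from the response quoted above.
# Second element: invented placeholder for another overlapping transcript.
SAMPLE = """<opt>
  <data ID="ENSMUST00000163188" Parent="ENSMUSG00000057729"
        biotype="processed_transcript" feature_type="transcript"
        seq_region_name="10" start="79874476" end="79883041" strand="1"/>
  <data ID="ENSMUST_HYPOTHETICAL" Parent="ENSMUSG_HYPOTHETICAL"
        feature_type="transcript" seq_region_name="10"
        start="79860475" end="79874634" strand="-1"/>
</opt>"""


def parent_gene(xml_text, transcript_id):
    """Return the Parent of the <data> element whose ID matches, else None."""
    root = ET.fromstring(xml_text)
    for element in root.iter("data"):  # also matches a lone <data> root
        if element.get("ID") == transcript_id:
            return element.get("Parent")
    return None


print(parent_gene(SAMPLE, "ENSMUST00000163188"))  # prints ENSMUSG00000057729
```

Entries with any other ID are simply skipped, which is the filtering
described earlier in the thread.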
>
>
> Hope that helps,
> Magali
>
> On 19/05/2014 10:00, Saren Tasciyan wrote:
>
> Hi,
>
> I am using the REST API for my Java application, but I am stuck on a
> problem:
> I have a list of transcript IDs (ENSMUST....) from which I want to gather
> gene IDs. So I decided to use the following endpoint:
> http://beta.rest.ensembl.org/feature/id/ENSMUST00000163188?feature=gene;content-type=text/xml
> As you can see, this returns 2 IDs:
>
> <opt>
>
> <data ID="ENSMUSG00000035835" biotype="protein_coding" description="cDNA
> sequence BC005764 [Source:MGI Symbol;Acc:MGI:2388640]" end="79874634"
> external_name="BC005764" feature_type="gene"
> logic_name="ensembl_havana_gene" seq_region_name="10" source="ensembl_havana"
> start="79860475" strand="-1"/>
>
> <data ID="ENSMUSG00000057729" biotype="protein_coding"
> description="proteinase 3 [Source:MGI Symbol;Acc:MGI:893580]" end="79883174"
> external_name="Prtn3" feature_type="gene" logic_name="ensembl_havana_gene"
> seq_region_name="10" source="ensembl_havana" start="79874476" strand="1"/>
>
> </opt>
>
>
> When I query the Ensembl website, I get only one result:
> http://www.ensembl.org/Mus_musculus/Transcript/Summary?db=core;g=ENSMUSG00000057729;r=10:79874476-79883041;t=ENSMUST00000163188
>
> My first idea when I ran into this problem was to use logic_name as an
> indicator, since the problem is even more frequent with ncRNA, etc. However,
> both results here have the same logic_name.
> According to my dataset, the correct result would be the second one. Am I
> using the wrong approach here, or should I include more parameters in my query?
>
> As further examples, here are more queries that return multiple results:
> http://beta.rest.ensembl.org/feature/id/ENSMUST00000040746?feature=gene;content-type=text/xml
> http://beta.rest.ensembl.org/feature/id/ENSMUST00000075162?feature=gene;content-type=text/xml
> http://beta.rest.ensembl.org/feature/id/ENSMUST00000080673?feature=gene;content-type=text/xml
> http://beta.rest.ensembl.org/feature/id/ENSMUST00000118639?feature=gene;content-type=text/xml
>
> Cheers,
> Saren
>
> --
> Saren Tasciyan
>
> Master Student & IT Technician at Karl Kuchler Group
> Room: 2.107
>
> Max F. Perutz Laboratories GmbH
> Dr. Bohr-Gasse 9
> A-1030 Wien
>
> T: +43-1-4277- 61812
> E: saren.tasciyan at univie.ac.at
>
>
>
>
> _______________________________________________
>
> Dev mailing list    Dev at ensembl.org
>
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
>
> Ensembl Blog: http://www.ensembl.info/
>



