[ensembl-dev] Ensembl REST API | GET map/cdna/:id/:region

Ramiro Magno ramiro.magno at gmail.com
Tue Sep 24 17:13:12 BST 2019


Hi Ameya,

Thank you for your quick response!

But I still have a few questions regarding your reply.

*Question 7 (Q7)*

Firstly, my question is only about the REST API, so I can't really make use
of your other suggestions regarding the Perl API or the "read
only connection to the database".

When it comes to Ensembl's REST API endpoint
"map/:species/:asm_one/:region/:asm_two", there are several parameters
available, one of which is "coord_system". For someone with some prior
knowledge of genome assembly concepts, it is easy to deduce that this
parameter must accept one value out of an enumeration of possible choices.
So my question is: what are these choices? I understand I can make queries
as you suggested and infer the possible values from the responses.
However, this approach is not guaranteed to be exhaustive, and it does not
tell me how it generalises to species other than human. Given that this
fine-grained level of documentation of the REST API is not yet available,
we, as users, have to guess what values these parameters take. In the doc,
only example values are given
(https://rest.ensembl.org/documentation/info/assembly_map) which
already qualifies as a schema :)
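
To make the workaround concrete, here is a minimal Python sketch of how I
understand it: fetch info/assembly for one species and de-duplicate the
coord_system values found under top_level_region. The helper names
(fetch_assembly_info, unique_coord_systems) are mine, and the assumption that
every top_level_region entry is an object carrying a "coord_system" key is
inferred from responses, not from the documentation.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def fetch_assembly_info(species: str) -> Dict[str, Any]:
    """Fetch the info/assembly JSON for one species (needs network access)."""
    url = (
        "https://rest.ensembl.org/info/assembly/"
        f"{species}?content-type=application/json"
    )
    with urlopen(url) as resp:
        return json.load(resp)


def unique_coord_systems(assembly_info: Dict[str, Any]) -> List[str]:
    """Return the sorted, de-duplicated coord_system values.

    Assumption: "top_level_region" is a list of objects, each of which may
    carry a "coord_system" key. Entries without that key are skipped.
    """
    regions = assembly_info.get("top_level_region", [])
    return sorted({r["coord_system"] for r in regions if "coord_system" in r})


# Hand-made sample in the assumed shape (not a real response).
sample = {
    "top_level_region": [
        {"coord_system": "chromosome", "name": "1"},
        {"coord_system": "chromosome", "name": "2"},
        {"coord_system": "scaffold", "name": "example_scaffold"},
    ]
}
print(unique_coord_systems(sample))  # ['chromosome', 'scaffold']
```

Against the live service this would be called as
unique_coord_systems(fetch_assembly_info("homo_sapiens")). Even so, it only
reports what happens to be present in one species' top-level regions, which
is exactly why it cannot replace a documented enumeration.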

*Question 8 (Q8)*

Again, I am sorry, but I don't really understand your answer here. I will
try to be more explicit.

When the queried region is seemingly not mappable, the structure of the JSON
object in the response is not the same as when we perform a bona fide
query, i.e. a query with a region that is mappable to the genome.

My suggestion is to return the same type of JSON object in all
circumstances, regardless of whether the region is mappable or not. In my
previous email, I made a few extra suggestions on possible values for the
keys in the JSON object when the queried region does not yield a new mapped
region.
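
As a client-side illustration of what I mean, the sketch below coerces any
response into the same list-of-records shape. It assumes the response carries
a "mappings" list and that each record has the "start", "end", "gap" and
"rank" keys seen in the output. The function name normalise_mappings and the
choice of null placeholders are only for illustration, not a proposal for
the exact server-side format.

```python
from typing import Any, Dict, List

# Keys taken from the responses discussed in this thread; the real
# records may carry more fields than these.
EXPECTED_KEYS = ("start", "end", "gap", "rank")


def normalise_mappings(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return mapping records that always have the same keys.

    Assumption: a mappable query yields {"mappings": [ {...}, ... ]}.
    Anything else (missing key, empty list) is treated as "not mappable"
    and becomes a single record whose values are all None.
    """
    mappings = response.get("mappings") or []
    if not mappings:
        return [{key: None for key in EXPECTED_KEYS}]
    return [{key: m.get(key) for key in EXPECTED_KEYS} for m in mappings]


# Hand-made inputs in the assumed shape (not real responses).
mapped = {"mappings": [{"start": 100, "end": 200, "gap": 0, "rank": 0}]}
unmapped = {"mappings": []}

print(normalise_mappings(mapped))
print(normalise_mappings(unmapped))
```

With a uniform shape like this, downstream code can iterate over the records
without first checking which kind of response it received.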

So I don't quite understand why you say this equates to a "generic and
larger request". As far as I understand this has only to do with this
specific endpoint. Moreover, you also mention "http code", and I don't
understand how the response code is related to my suggestions. These would
still be all successful responses, as far as I understand.

*Liftover question*

Another kind request: could you please also answer my other question, sent
as a separate email whose title is "Ensembl REST API |
map/:species/:asm_one/:region/:asm_two" about a possible bug in the
liftover?

Looking forward to hearing from you!

RM



On Tue, 24 Sep 2019 at 15:19, Ameya Chaubal <ameya at ebi.ac.uk> wrote:

> Hi Ramiro,
>
> Thanks for the follow-up. Here are a couple of thoughts on your queries:
>
>
> Q8) This is more of a generic and larger request, which could potentially
> lead to re-examining most of the endpoints. E.g., going by REST
> standards, a correct HTTP code needs to be sent with each response. So this
> might be looked at at a later stage.
> Q7) There are a couple of options to get this data:
>
> - Using REST endpoint:
> http://rest.ensembl.org/info/assembly/homo_sapiens?content-type=application/json
> The top_level_region key holds a list of entries whose coord_system values
> need to be de-duplicated
>
> - Use Perl API:
> Refer to the script given under ‘Coordinate Systems & Slices’ over here :
> https://www.ebi.ac.uk/training/online/sites/ebi.ac.uk.training.online/files/u1218/Core3.pdf
>
> - Use the read-only connection to the database and get it from the
> 'coord_system' table
>
> Thanks,
> Ameya
>
>
> On 24 Sep 2019, at 10:18, Ramiro Magno <ramiro.magno at gmail.com> wrote:
>
> Hi Devs,
>
> When can I expect an update on these questions (Q7 and Q8)?
>
> Thanks a lot!
>
> Cheers, RM
>
> On Fri, 13 Sep 2019 at 17:53, Ramiro Magno <ramiro.magno at gmail.com> wrote:
>
>> Thank you Brandon for the information!
>>
>> On Fri, 13 Sep 2019 at 17:38, Brandon Walts <bwalts at ebi.ac.uk> wrote:
>>
>>> Hi Ramiro
>>>
>>> Apologies for the delay. We have some answers for you now, and more will
>>> be forthcoming:
>>>
>>> Q1: This endpoint should work for most species. I have seen the errors
>>> that sometimes arise, and we suspect (but have not confirmed) that
>>> there may be a bug in the TranscriptMapper that supports this in the
>>> underlying Perl API. We will look into this further.
>>>
>>> Q2: No, but this is a good suggestion and we will look into possibly
>>> implementing this.
>>>
>>> Q3: Yes, only transcript IDs are valid for this endpoint.
>>>
>>> Q4 and Q5: For the first part of your question, basically yes. Gap
>>> represents a gap in the sequence. We will look into whether setting the
>>> "overrun" to a gap is the desired behaviour, or whether there may be a
>>> better way to return a result for a query that goes beyond the end of a
>>> transcript.
>>>
>>> Best
>>> -Brandon
>>> On 10/09/2019 14:21, Ramiro Magno wrote:
>>>
>>> Hi Devs,
>>>
>>> I have a few questions about the endpoint "map/cdna/:id/:region".
>>>
>>> Q1: How can I find what species work with this endpoint?
>>>
>>> Q2: Is there a way of specifying the "range" parameter so that it starts
>>> at 1 and goes till the end of the cDNA sequence? E.g., 1..end?
>>>
>>> Q3: In the doc, the description of the "id" parameter reads "An Ensembl
>>> stable ID". But only transcript ids are valid choices, correct?
>>> For example, making the nonsensical query with a gene id gives back a not
>>> so tidy error:
>>> https://rest.ensembl.org/map/cdna/ENSG00000115263/1..1000?content-type=application/json;include_original_region=1
>>> .
>>>
>>>
>>> Q4: What is the meaning of the variable "gap" in the JSON response? Does
>>> gap=0 mean an exon, and gap=1 an un-mappable region? When I exceed the end
>>> position of the transcript in the "range" parameter I always get the excess
>>> as a region flagged as gap=1.
>>>
>>>
>>> Q5: When I get an excess region as a result of asking for a cDNA range
>>> that goes beyond the length of the transcript, why are the genomic
>>> coordinates "start" and "end" reported as if they were in the cDNA
>>> coordinate system? E.g., the request:
>>> https://rest.ensembl.org/map/cdna/ENST00000375497/1..1000?content-type=application/json;include_original_region=1
>>> yields a last block whose coordinates are start=800, end=1000 for both
>>> coordinate systems: "cdna" and "chromosome". Would it not make more sense
>>> to just report NA or Null instead?
>>>
>>>
>>> Q6: What is the meaning of the "rank" variable in the json output?
>>>
>>>
>>> Many thanks in advance!
>>>
>>> Cheers, Ramiro
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info: https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
>>> Ensembl Blog: http://www.ensembl.info/
>>>