[ensembl-dev] VEP API returning 400 for memory error

Asier Gonzalez gonzaleza at ebi.ac.uk
Wed Aug 21 15:16:33 BST 2019


Hi Anja,

This is the part of the code that manages the queries: 
https://github.com/opentargets/snp_to_gene/blob/master/snp_assignment.pl#L153-L179. 
If you look further down you will see that it cannot be a 429 code, as 
those are handled explicitly. I could add retries for 400 codes if 
you think that is the best solution. I asked about the memory 
allocation error because it is the one that resolves when I try 
again, but around 50 variants always give a 400 code even 
though they exist in Ensembl. One example is rs1057518506, which is a 
deletion according to dbSNP 
<https://www.ncbi.nlm.nih.gov/snp/rs1057518506>, is described as a 
frameshift variant in Ensembl but it lacks alleles 
<https://www.ensembl.org/Homo_sapiens/Variation/Explore?db=core;r=17:7220304-7221313;v=rs1057518506;vdb=variation;vf=371667648> 
and the API returns a 400 code saying that the length of the reference 
allele is 0 
<https://rest.ensembl.org/vep/human/id/rs1057518506?content-type=application/json>. 
I am happy to share these cases with you if you are interested in them.
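
For illustration, the retry rule discussed here could be sketched as 
below. This is a minimal Python sketch and not the Perl code in 
snp_assignment.pl. It assumes the transient failure can be recognised by 
the "Cannot allocate memory" text in the response body, and the names 
is_retryable, fetch_with_retry and the fetch callable are hypothetical. 
A 400 with any other message (such as the zero-length reference allele 
case) is returned immediately, since retrying would not help.

```python
import time
from typing import Callable, Tuple

# Assumption: the transient failure is recognised by this substring,
# taken from the error message quoted in this thread.
TRANSIENT_400_MARKER = "Cannot allocate memory"


def is_retryable(status: int, body: str) -> bool:
    """True for any 5XX, and for a 400 whose body carries the memory error."""
    if 500 <= status < 600:
        return True
    return status == 400 and TRANSIENT_400_MARKER in body


def fetch_with_retry(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call fetch() until the response is not retryable or attempts run out.

    fetch is a hypothetical callable returning (HTTP status, response body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if attempt == max_attempts or not is_retryable(status, body):
            break
        sleep(delay * attempt)  # linear back-off between attempts
    return status, body
```

The fetch callable would wrap a single GET request for one variant, so 
the existing 429 handling can stay where it is.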

I understand that using the POST endpoints would be a better solution, 
at least because we could use a single query to retrieve data about 
multiple variants, thus reducing the burden on the API. However, I am 
afraid that this is a piece of code that we run twice every two months 
and it is not a priority for us to refactor it unless you have a good 
reason for me to convince my managers.
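
As a rough idea of what the POST approach involves, the sketch below 
groups rsIDs into batches and builds one request per batch for the 
/vep/human/id endpoint, which accepts a JSON body of the form 
{"ids": [...]}. It is a minimal Python sketch and not a drop-in 
replacement for the Perl script. The batch size of 200 is an assumption 
to be checked against the vep_id_post documentation, and the helper 
names chunked, build_vep_id_post and post_batch are hypothetical.

```python
import json
import urllib.request
from typing import Iterator, List, Sequence

VEP_ID_POST_URL = "https://rest.ensembl.org/vep/human/id"
BATCH_SIZE = 200  # assumption: confirm the limit in the vep_id_post docs


def chunked(ids: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def build_vep_id_post(ids: Sequence[str]) -> urllib.request.Request:
    """Build (but do not send) one POST request covering several variant ids."""
    payload = json.dumps({"ids": list(ids)}).encode("utf-8")
    return urllib.request.Request(
        VEP_ID_POST_URL,
        data=payload,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def post_batch(ids: Sequence[str], timeout: float = 60.0) -> list:
    """Send one batch and return the decoded JSON response."""
    with urllib.request.urlopen(build_vep_id_post(ids), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a few thousand variants this would replace thousands of GET calls 
with a few dozen POST calls, for example 
`for batch in chunked(all_ids): results.extend(post_batch(batch))`.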

Thank you,
Asier

On 21/08/2019 14:54, Anja Thormann wrote:
> Hi Asier,
>
> I would like to take a look at your script please. I recommend for the 
> first part that you use our VEP POST endpoints for region 
> <https://rest.ensembl.org/documentation/info/vep_region_post> and id 
> <https://rest.ensembl.org/documentation/info/vep_id_post>. At this 
> point I recommend that you rerun your requests a few times on a 400 
> error and if the requests keep failing contact us with details 
> (variant id or region) of your failed requests. Can you rule out that 
> the error message has a 429 code due to too many requests?
>
> Thank you,
> Anja
>
>> On 21 Aug 2019, at 14:28, Asier Gonzalez <gonzaleza at ebi.ac.uk 
>> <mailto:gonzaleza at ebi.ac.uk>> wrote:
>>
>> Hello Anja,
>>
>> Thank you for your email. This is a script that calls both the id 
>> (/vep/human/id/) and region (/vep/human/region/) endpoints depending 
>> on whether the variant is defined by an rsID or by genomic coordinates. 
>> It calls the API thousands of times using GET requests, once per 
>> variant. I don't see any other settings but I can point you to the 
>> few code lines that control it on Github if you want to have a look 
>> yourself.
>>
>> Please let me know if I can help. I just need to know whether this is 
>> expected behaviour, as our script retries calling the API if the 
>> response is a 5XX error but it passes if it's a 400, since that should 
>> have been caused by the query and retrying should not make any difference.
>>
>> Best wishes,
>> Asier
>>
>> On 21/08/2019 13:39, Anja Thormann wrote:
>>> Dear Asier,
>>>
>>> thank you for your feedback. Could you please let me know which VEP 
>>> endpoint and settings you use? If you are using a POST endpoint, how 
>>> many variant ids or regions are you sending? I assume that the 
>>> problem happens at a point during the VEP calculation where it is 
>>> difficult to differentiate the cause of the problem. But we will 
>>> investigate further and hopefully be able to provide a more accurate 
>>> error report.
>>>
>>> Best wishes,
>>> Anja
>>>
>>>> On 21 Aug 2019, at 12:47, Asier Gonzalez <gonzaleza at ebi.ac.uk 
>>>> <mailto:gonzaleza at ebi.ac.uk>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have a script that calls the VEP API for a few thousand variants 
>>>> and does some processing. The script captures HTTP errors and 
>>>> messages, and I have found a few cases where the 400 error is 
>>>> accompanied by an "ERROR: Cannot allocate memory" message. If I 
>>>> query the API again with the ids that produced that error I get 
>>>> results, so I understand that this is a temporary issue. I could 
>>>> handle the 400 errors further to control these cases, but I wonder 
>>>> whether this is expected behaviour or an issue, as it sounds like 
>>>> this should be a 5XX error instead of a 400.
>>>>
>>>> Kind regards,
>>>> Asier
>>>>
>>>>
>>>> _______________________________________________
>>>> Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>>>> Posting guidelines and subscribe/unsubscribe info: 
>>>> https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
>>>> Ensembl Blog: http://www.ensembl.info/
>>>
>