[ensembl-dev] Understanding VEP Rest output

Anja Thormann anja at ebi.ac.uk
Thu Apr 12 16:02:21 BST 2018


Dear Beat,

You cannot disable the transcript annotations in the VEP output. However, we know that you are describing a very common use case, and later this year we plan to provide an endpoint that combines both calls into one, returning the variants in a region together with their annotations, including allele frequencies.

HTH,
Anja

> On 11 Apr 2018, at 08:05, Wolf Beat <Beat.Wolf at hefr.ch> wrote:
> 
> Thank you very much for the fast answer and fix to the problem. Also thank you for the heads up for the API changes in the next version.
> 
> 
> I do have a small request though. Is there a way to disable the whole transcript consequences part of the REST VEP API? I know that sounds counterintuitive, but I'm only interested in the allele frequencies.
> 
> 
> While we are on the subject of requests, I could further optimize my code and reduce the load on the REST server if I were able to combine two of my queries.
> 
> 
> Currently I first search for variants overlapping a certain region (where my variants of interest are) using the overlap/region endpoint.
> 
> I then determine the IDs of all my variants and query the VEP endpoint to get the MAF numbers.
> 
> If there were an optional way to get the MAF numbers directly from the overlap endpoint, I would not have to query the VEP endpoint.
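
For reference, the two-step workflow described above could be sketched roughly as follows. This is only a sketch under stated assumptions: the helper names are hypothetical, the `feature=variation` parameter and the `id` field of the overlap records are assumptions about the overlap/region endpoint, the region string is a placeholder, and no request is actually sent.

```python
from urllib.parse import urlencode

SERVER = "http://rest.ensembl.org"  # server used in the URLs quoted in this thread


def overlap_url(species, region):
    """Step 1: URL listing the variants that overlap a region."""
    # Assumption: feature=variation restricts the overlap results to variants.
    query = urlencode(
        {"feature": "variation", "content-type": "application/json"}, safe="/"
    )
    return f"{SERVER}/overlap/region/{species}/{region}?{query}"


def variant_ids(overlap_records):
    """Collect variant IDs from a decoded overlap response.

    Assumption: the response is a list of dicts, each carrying an 'id' key.
    """
    return [rec["id"] for rec in overlap_records if "id" in rec]


def vep_url(species, variant_id):
    """Step 2: URL of the VEP annotation for one variant ID."""
    query = urlencode({"ExAC": 1, "content-type": "application/json"}, safe="/")
    return f"{SERVER}/vep/{species}/id/{variant_id}?{query}"


# Hypothetical decoded overlap response, used only to show the data flow.
records = [{"id": "rs351771"}]
for vid in variant_ids(records):
    print(vep_url("hsapiens", vid))
```

Each ID found in step 1 costs one additional VEP request in step 2, which is the overhead a combined endpoint would remove.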
> 
> 
> Kind regards
> 
> 
> Beat Wolf
> 
> ________________________________
> From: Dev <dev-bounces at ensembl.org> on behalf of Anja Thormann <anja at ebi.ac.uk>
> Sent: Tuesday, April 10, 2018 6:11:34 PM
> To: Ensembl developers list
> Subject: Re: [ensembl-dev] Understanding VEP Rest output
> 
> 
> Dear Beat,
> 
> First of all, thank you for reporting the problem with the strange allele frequencies. This has been corrected. Please check the REST output again: http://rest.ensembl.org/vep/hsapiens/id/rs351771?ExAC=1&content-type=application/json
> 
> Secondly, we recently replaced the usage of _maf with _af for exactly the reasons for confusion that you mention. We are also no longer reporting the minor allele and minor allele frequency with the VEP.
> 
> Unfortunately, we overlooked the usage of maf in our REST output.
> 
> We will update the REST output to match the output format of the VEP script output. This is planned for release/94.
> 
> To summarise: When annotating a variant with the VEP, the VEP reports allele frequencies from co-located variants. Frequencies are only reported for the non-reference input allele.
> 
> The new keys in the REST output will be:
> 
> AF (global allele frequency (AF) from 1000 Genomes Phase 3)
> MAX_AF, MAX_AF_POPS (highest allele frequency observed in any population from 1000 Genomes, ESP or gnomAD, and the populations in which it was observed)
> AFR_AF, AMR_AF, EAS_AF, EUR_AF, SAS_AF (allele frequency from continental populations (AFR,AMR,EAS,EUR,SAS) of 1000 Genomes Phase 3)
> AA_AF, EA_AF (allele frequency from NHLBI-ESP)
> gnomAD_AF, gnomAD_AFR_AF, gnomAD_AMR_AF, gnomAD_ASJ_AF, gnomAD_EAS_AF, gnomAD_FIN_AF, gnomAD_NFE_AF, gnomAD_OTH_AF, gnomAD_SAS_AF (allele frequency from Genome Aggregation Database (gnomAD))
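
A client interested only in frequencies could pick these keys out of a record along the following lines. This is a minimal sketch: the key names are the ones listed above, but the flat-dictionary record shape, the function name, and the example values are assumptions made for illustration, not the actual REST output format.

```python
# Key names as listed above for the planned REST output.
THOUSAND_GENOMES = ["AF", "AFR_AF", "AMR_AF", "EAS_AF", "EUR_AF", "SAS_AF"]
ESP = ["AA_AF", "EA_AF"]
GNOMAD = [
    "gnomAD_AF", "gnomAD_AFR_AF", "gnomAD_AMR_AF", "gnomAD_ASJ_AF",
    "gnomAD_EAS_AF", "gnomAD_FIN_AF", "gnomAD_NFE_AF", "gnomAD_OTH_AF",
    "gnomAD_SAS_AF",
]
SUMMARY = ["MAX_AF", "MAX_AF_POPS"]
FREQUENCY_KEYS = THOUSAND_GENOMES + ESP + GNOMAD + SUMMARY


def frequency_fields(record):
    """Keep only the frequency-related entries of a record.

    Assumption: the record is a flat dict keyed by the names above.
    """
    return {key: record[key] for key in FREQUENCY_KEYS if key in record}


# Invented record: the ID and values are placeholders, not real data.
example = {"id": "rs0000", "AF": 0.25, "gnomAD_AF": 0.3, "other": "ignored"}
print(frequency_fields(example))
```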
> 
> Here is a more detailed description of frequency related output fields: https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#existing
> 
> For now please don’t be confused by the usage of maf in our REST output.
> 
> minor_allele and minor_allele_freq refer to the second most common allele, and its frequency, at the position where a sequence variant (such as a SNP) has been identified. In Ensembl, the global MAF is calculated using the allele frequencies across all 1000 Genomes Phase 3 populations.
> 
> population_maf and population_allele match the input non-reference allele and give the allele frequency reported for the respective population.
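
To make the distinction concrete, here is a hedged sketch that separates the two kinds of fields in one co-located variant record. It assumes the population fields come in pairs named `<population>_maf` and `<population>_allele`; the function name is hypothetical, and the record and its values are invented and are not real output for any variant.

```python
def split_maf_fields(record):
    """Separate the global minor allele from the per-population pairs."""
    global_minor = (record.get("minor_allele"), record.get("minor_allele_freq"))
    populations = {}
    for key, value in record.items():
        if key.endswith("_maf"):
            pop = key[: -len("_maf")]
            # Pair each frequency with the allele it was reported for.
            populations[pop] = (record.get(pop + "_allele"), value)
    return global_minor, populations


# Invented record in which the global minor allele differs from the input allele.
example = {
    "minor_allele": "G",
    "minor_allele_freq": 0.4,
    "gnomad_maf": 0.6,
    "gnomad_allele": "A",
}
global_minor, populations = split_maf_fields(example)
print(global_minor, populations)
```

Because the global minor allele can differ from the allele in the per-population pairs, as in the rs351771 case discussed in this thread, the two kinds of frequency should not be compared without first checking which allele each one refers to.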
> 
> 
> Please let me know if you have any further questions,
> 
> Kind regards,
> Anja
> 
> On 10 Apr 2018, at 12:35, Wolf Beat <Beat.Wolf at hefr.ch> wrote:
> 
> Hi, I have a question about the minor_allele field of the colocated_variants field in the VEP response.
> 
> 
> For rs351771 (G>A) I don't fully understand the logic of the answer:
> 
> 
> http://rest.ensembl.org/vep/hsapiens/id/rs351771?ExAC=1&content-type=application/json
> 
> 
> What I have trouble understanding is the minor allele reported for 1000 Genomes (that's minor_allele, I think) and for the others.
> 
> For all except minor_allele, the minor allele is "A". But for minor_allele it's "G".
> 
> 
> From what I can guess, the variant is borderline between a true variant and a false reference, meaning the reference sequence might actually carry the minor allele.
> 
> 
> But what I don't understand are the numbers from gnomAD etc. reported by VEP. If I read the http://gnomad.broadinstitute.org/variant/5-112164561-G-A website correctly, the frequency of A should be quite high, 50%+. Yet VEP tells me that the frequency of A in gnomAD is way under 1%.
> 
> 
> Am I missing something here? Can somebody help me to better understand the output of VEP?
> 
> 
> Kind regards
> 
> 
> Beat Wolf
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
> 