[ensembl-dev] 2 questions

William McLaren wm2 at ebi.ac.uk
Mon Aug 7 09:22:55 BST 2017

Hi Matthew,

On 4 August 2017 at 22:33:12, Maher, Matthew (matthew_maher at meei.harvard.edu) wrote:

1. Is this the correct forum for asking a question about VEP 
functionality? if not, what is? 
Yes, it is!

2. In VEP, with --vcf output in use, invoking --custom annotations 
seems to result in the new/custom values being added as new positional 
values within the CSQ/ANN structure - and that structure can have any # 
of value sets (possibly zero? I'm not sure). The result of this 
placement seems to be the new/custom annotations being unnecessarily 
repeated (since they apply to the position, not the specific transcript 
annotations) and potentially missed completely (in the case of no CSQ 
entries). Is there an option (I can't find it) to cause the new/custom 
annotation to appear as a new standalone KEY=[VALUE,VALUE,...] entry in 
the INFO field (outside of the CSQ entry), with a VCF header entry 
indicating "Number=A" (one value per alternate allele)? 
--custom annotations are added to the CSQ/ANN structure. There’s currently no way to force VEP to write them as separate INFO fields. How many fields are added depends on the file format of your custom annotations and any options you select. The fields added are described in the VCF header.
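
Since the added fields are listed in the VCF header, one way to pull them back out downstream is to read the field order from the CSQ header line and split each CSQ value accordingly. The following is only a minimal sketch: the header text, the transcript IDs and the custom field name "MyScore" are made up for illustration, and the helper names are hypothetical rather than part of VEP.

```python
import re

def csq_field_names(header_line):
    """Return the field names listed after 'Format: ' in a CSQ/ANN header line."""
    match = re.search(r'Format: ([^"]+)"', header_line)
    if match is None:
        raise ValueError("no 'Format: ' section found in header line")
    return match.group(1).strip().split("|")

def parse_csq(info_value, field_names):
    """Split one CSQ value into a list of dicts, one per annotation block."""
    blocks = []
    for entry in info_value.split(","):      # blocks are comma-separated
        values = entry.split("|")            # fields are pipe-separated
        blocks.append(dict(zip(field_names, values)))
    return blocks

# Hypothetical header and CSQ value; "MyScore" stands in for a custom short name.
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
          'from Ensembl VEP. Format: Allele|Consequence|Feature|MyScore">')
csq = "T|missense_variant|ENST00000000001|0.87,T|intron_variant|ENST00000000002|0.87"

names = csq_field_names(header)
blocks = parse_csq(csq, names)
for block in blocks:
    print(block["Feature"], block["MyScore"])
```

Note that the custom value is repeated in every block here, which is the repetition described in the question.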

For GFF and GTF files, the gene models are treated the same as other gene models loaded from the cache or database, so they are used to calculate consequences etc. and will result in a full “block” of annotation in the VCF output for each overlapped transcript. No specific fields are added other than the SOURCE field being set to the short name (or filename).

For BED and bigWig files, only one field is added. For bigWig this is the recorded score at that position. For BED this will be either the 4th column of the BED (usually some sort of ID), or if this is absent (or the report_coords flag is set), the coordinates of the feature.
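
The BED rule above can be sketched as a small function. This is an illustration of the selection logic only, not VEP's code: the function name is hypothetical, the "chrom:start-end" layout of the coordinate string is an assumption, and no coordinate-system conversion is attempted.

```python
def bed_annotation_value(bed_line, report_coords=False):
    """Pick the single value reported for one overlapping BED feature.

    Uses the 4th column when present, otherwise (or when report_coords
    is set) falls back to the feature coordinates.
    """
    fields = bed_line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], fields[1], fields[2]
    name = fields[3] if len(fields) > 3 and fields[3] else None
    if name is None or report_coords:
        # Assumed layout for the coordinate string; shown for illustration.
        return f"{chrom}:{start}-{end}"
    return name

with_id = bed_annotation_value("1\t1000\t2000\tenhancer_42")
without_id = bed_annotation_value("1\t1000\t2000")
forced = bed_annotation_value("1\t1000\t2000\tenhancer_42", report_coords=True)
```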

For VCF files, the minimal behaviour is similar to BED, but users may optionally request that additional keys from the custom VCF's INFO field be reported. Where the custom VCF provides these values per allele, they are reported in an allele-specific manner. This allows, for example, VEP to get per-allele frequency data from gnomAD VCF files.
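
Because these values are allele-specific, they can be collapsed after the fact into one value per alternate allele, which is close to the "Number=A" layout asked about. The sketch below assumes the CSQ blocks have already been parsed into dicts, that the Allele field of each block matches the ALT string exactly (true for simple SNVs, not necessarily for indels), and that the custom field is called "gnomADg_AF"; that name and the function name are illustrative assumptions.

```python
def per_allele_values(blocks, alts, field):
    """Collapse a per-block custom field into one value per ALT allele.

    blocks: list of dicts, one per CSQ annotation block
    alts:   ALT alleles in the order they appear in the VCF record
    field:  name of the custom field to collapse
    Returns a comma-separated string with "." for alleles without a value.
    """
    values = {alt: None for alt in alts}
    for block in blocks:
        allele = block.get("Allele")
        value = block.get(field, "")
        # Keep the first non-empty value seen for each ALT allele.
        if allele in values and value != "" and values[allele] is None:
            values[allele] = value
    return ",".join(values[alt] if values[alt] is not None else "." for alt in alts)

# Hypothetical parsed blocks for a site with ALT alleles T and G.
example_blocks = [
    {"Allele": "T", "Consequence": "missense_variant", "gnomADg_AF": "0.01"},
    {"Allele": "T", "Consequence": "intron_variant", "gnomADg_AF": "0.01"},
    {"Allele": "G", "Consequence": "missense_variant", "gnomADg_AF": "0.2"},
]
collapsed = per_allele_values(example_blocks, ["T", "G"], "gnomADg_AF")
print("gnomADg_AF=" + collapsed)
```

A site with no CSQ blocks at all yields "." for every allele, so the missing-annotation case is at least visible rather than silently dropped.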

Hope that helps!

Will McLaren

Ensembl Variation

Thank You All 

very nice tool, VEP! 
