[ensembl-dev] invalid vcf output

Will McLaren wm2 at ebi.ac.uk
Mon Jul 22 09:26:12 BST 2013


Hi Reece,

Thanks for pointing this out. This could catch out parsers that split on
the "=".
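
To illustrate the failure mode, here is a minimal Python sketch of two hypothetical INFO parsers. Neither is taken from VEP or from any real VCF library. The function names are made up, and the INFO string is a shortened stand-in modelled on the record Reece quotes below, not actual VEP output.

```python
# Shortened, illustrative INFO string modelled on the record quoted in
# this thread; it is not verbatim VEP output.
info = "CSQ=ENST00000341065.4:c.1163G>C|ENST00000341065.4:c.1163G>C(p.=)|"


def parse_info_naive(info):
    """Hypothetical strict parser: splits each entry on every '='."""
    fields = {}
    for entry in info.split(";"):
        parts = entry.split("=")
        # Exactly one '=' is expected, so "(p.=)" makes this fail.
        if len(parts) != 2:
            raise ValueError(f"unexpected '=' in INFO entry: {entry!r}")
        fields[parts[0]] = parts[1]
    return fields


def parse_info_tolerant(info):
    """Hypothetical lenient parser: splits each entry on the first '=' only."""
    fields = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        # Entries without '=' are treated as flags.
        fields[key] = value if sep else True
    return fields
```

With the string above, parse_info_naive raises ValueError, while parse_info_tolerant keeps the whole CSQ value, including the "(p.=)". Splitting on the first "=" only is a workaround on the reading side; the record still contains an "=" that the spec text quoted below does not permit.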

A couple of things get converted when output is written to VCF (","
becomes "&"), but I missed this one.

I'm not sure what would be best to do here - I could pick a character to
swap the "=" to, or perhaps use something like "eq" instead?
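
As a rough Python sketch of what such a substitution could look like (VEP itself is written in Perl, and this is not its code): the "," to "&" mapping is the one mentioned above, while "=" to "eq" is only one of the options being floated, not a decided behaviour. The names SUBSTITUTIONS and escape_info_value are made up for illustration.

```python
# Assumed substitution table. "," -> "&" is the conversion mentioned in
# this thread; "=" -> "eq" is just one candidate, not a settled choice.
SUBSTITUTIONS = {",": "&", "=": "eq"}


def escape_info_value(value, substitutions=SUBSTITUTIONS):
    """Replace characters that are reserved inside a VCF INFO value."""
    for char, replacement in substitutions.items():
        value = value.replace(char, replacement)
    return value


escaped = escape_info_value("ENST00000341065.4:c.1163G>C(p.=)")
print(escaped)  # ENST00000341065.4:c.1163G>C(p.eq)
```

One trade-off with any replacement is that it cannot be reversed reliably if the replacement text can also occur naturally in a value, so a downstream tool would need to know the convention.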

Do any other VEP users have a preference?

Regards

Will McLaren
Ensembl Variation


On 22 July 2013 05:20, Reece Hart <reece at harts.net> wrote:

> Hi-
>
> VEP 72 reports protein consequences that contain an equal sign in the CSQ
> INFO section. This violates the VCF spec, which says "INFO additional
> information: (String, no white-space, semi-colons, or equals-signs
> permitted;" (http://goo.gl/R0C1U)
>
> Example:
> variant_effect_predictor.pl  --database --vcf -o - --hgvs
>
> with the variant ENST00000341065.4:c.1163G>C
>
> returns a record that contains
>  ... ENST00000341065.4:c.1163G>C|ENST00000341065.4:c.1163G>C(p.=)| ...
>
> I don't know whether there is an escaping mechanism for the INFO sections,
> so I'm not sure what should be done about this.
>
> Thanks,
> Reece