[ensembl-dev] VEP Extra output information

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Tue Apr 16 17:00:23 BST 2013


On 04/16/13 14:49, Will McLaren wrote:
> Hi Guillermo,
>
> There are two distinct ways you can add additional data to the output
> from the VEP.
>
> 1) Custom annotations - here you simply provide the VEP with a
> tabix-indexed position-based data file, and the VEP does the work of
> finding overlaps with your variant input and the data from the file.
>
> 2) Plugins - you write the code to add to or manipulate the internal
> data structures used by the VEP. In its simplest form, a plugin can
> just look up an attribute of some object and add it to the
> output.
>
> Writing a plugin requires a basic understanding of the Ensembl API,
> but getting a basic plugin working requires only a very small amount
> of code.
Since the additional data comes from multiple sources (APIs, files,
etc.), I guess plugins are the only way to go for me.
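
To make option 1 concrete for myself: a custom annotation is essentially a
position-based overlap lookup. The sketch below is not VEP code; it is a
plain Python illustration with made-up regions and a hypothetical
find_overlaps helper, and it scans linearly where tabix would use an index.

```python
# Illustrative only: a position-based overlap lookup, the idea behind
# custom annotations. All names and data here are hypothetical.
from collections import namedtuple

Annotation = namedtuple("Annotation", "chrom start end label")


def find_overlaps(chrom, pos, annotations):
    """Return labels of annotations whose 1-based inclusive interval covers pos."""
    return [
        a.label
        for a in annotations
        if a.chrom == chrom and a.start <= pos <= a.end
    ]


# Made-up annotation intervals standing in for a position-based data file.
annotations = [
    Annotation("1", 100, 200, "regionA"),
    Annotation("1", 150, 300, "regionB"),
    Annotation("2", 100, 200, "regionC"),
]

# A variant at chromosome 1, position 160 falls inside regionA and regionB.
print(find_overlaps("1", 160, annotations))
```

Data that cannot be keyed by genomic position (for example values fetched
through an API) does not fit this model, which is why plugins look like the
better fit for my case.
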
> The documentation
> (http://www.ensembl.org/info/docs/variation/vep/vep_script.html#plugins)
> explains all of this, but the best way to see how plugins work is to
> look at the existing plugins at
> https://github.com/ensembl-variation/VEP_plugins. I'd suggest looking
> at Conservation.pm and ProteinSeqs.pm as some relatively simple
> examples of retrieving additional data from the API.
Where do packages such as "package Conservation;" come from?
> You should note that using VCF output you will see repeated elements
> in the INFO field added, since the plugin gets run once for every
> variant/transcript overlap; all data appear under the CSQ field in the
> INFO column. Currently there is no way for the VEP via plugins to add
> separate INFO fields, however this is something we are looking into,
> and in fact would be relatively easy to "hack" in for someone
> determined enough (see subroutine vf_list_to_cons in
> Bio::EnsEMBL::Variation::Utils::VEP).
I'll look further into this tomorrow, since I have to go now.
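
As a note to self, this is how I understand the repetition described above:
the CSQ value holds one comma-separated block per variant/transcript
overlap, and each block is pipe-separated. The Python sketch below assumes
that layout; the field names, the transcript IDs and the "MyScore" plugin
value are invented for illustration, and parse_csq is a hypothetical helper.

```python
# Illustrative only: split a CSQ entry into one record per
# variant/transcript block. Field names and values are assumptions.

def parse_csq(info, field_names):
    """Return one dict per comma-separated CSQ block found in a VCF INFO string."""
    for item in info.split(";"):
        if item.startswith("CSQ="):
            blocks = item[len("CSQ="):].split(",")
            return [dict(zip(field_names, block.split("|"))) for block in blocks]
    return []


# Assumed field order; in a real file it is declared in the CSQ header line.
fields = ["Allele", "Feature", "Consequence", "MyScore"]

# Made-up INFO column: one variant overlapping two transcripts.
info = "DP=35;CSQ=A|ENST0001|missense_variant|0.9,A|ENST0002|intron_variant|0.9"

records = parse_csq(info, fields)
for record in records:
    # The plugin value is repeated in every transcript block.
    print(record["Feature"], record["MyScore"])
```
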

A workaround could be to have the plugins write their extra columns to a
temporary file and, at the end, merge the VCF output from the VEP script
with that file to add the additional columns.
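
A rough Python sketch of the merge step in that workaround, assuming the
extra values have already been collected into a dictionary keyed by
(CHROM, POS, REF, ALT). The add_info_fields helper, the GENE_DESC key and
the sample records are all hypothetical.

```python
# Illustrative only: append extra key=value pairs to the INFO column
# (the eighth tab-separated column) of matching VCF records.

def add_info_fields(vcf_lines, extra):
    """Return VCF lines with extra INFO entries added where a record matches."""
    merged = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            merged.append(line)  # header lines pass through unchanged
            continue
        cols = line.split("\t")
        key = (cols[0], cols[1], cols[3], cols[4])  # CHROM, POS, REF, ALT
        additions = extra.get(key, {})
        if additions:
            new_info = ";".join(f"{k}={v}" for k, v in additions.items())
            # "." marks an empty INFO column, so replace it instead of appending.
            cols[7] = new_info if cols[7] == "." else cols[7] + ";" + new_info
        merged.append("\t".join(cols))
    return merged


vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t160\t.\tG\tA\t50\tPASS\tCSQ=A|ENST0001|missense_variant",
    "2\t500\t.\tC\tT\t50\tPASS\t.",
]
extra = {
    ("1", "160", "G", "A"): {"GENE_DESC": "example_gene"},
    ("2", "500", "C", "T"): {"GENE_DESC": "other_gene"},
}

merged = add_info_fields(vcf, extra)
print("\n".join(merged))
```

For a fully valid VCF, each new key would also need its own ##INFO line in
the header, which this sketch leaves out.
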

Maybe I misunderstood you. Please correct me if I'm wrong.
> Hope this helps, and feel free to ask further questions!
>
> Will McLaren
> Ensembl Variation
Thank you so much.

Best regards,
Guillermo.
>
> On 16 April 2013 12:58, Guillermo Marco Puche
> <guillermo.marco at sistemasgenomicos.com> wrote:
>> Hello,
>>
>> I need to develop some extra features for VEP.
>>
>> My input files are in VCF format, and so is my output.
>>
>> But I want to add several additional columns of extra data to the VCF output.
>>
>> For example: AA conservation score, Biobase description, Biobase link, MAF
>> populations, flanking sequence, gene description, InterPro_ID and more.
>>
>> I've been reading the documents and I'm a bit confused about "Custom
>> annotations".
>> I think since the data I want is extra on the output and not in the input,
>> what I should do is develop several Plugins to obtain all the values I need.
>>
>> I think most of them can be obtained through the Ensembl API, although I'm
>> new to it. Others will require more hard-coding.
>>
>> I hope someone can clarify this matter for me.
>>
>> Thank you.
>>
>> Best regards,
>> Guillermo.
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>


-- 
g.marco: Informatician at Sistemas Genómicos S.L
phone: 0034635197460
web: www.sistemasgenomicos.com