[ensembl-dev] VEP Frequency annotation

Will McLaren wm2 at ebi.ac.uk
Fri May 4 13:00:38 BST 2012


Hi Duarte,

There is a hash reference passed to the run() method in your plugin
which contains the data to be printed in the output:

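sub run {
  my $self = shift;
  my $tva = shift;
  my $line_hash = shift;

  # get gene ID
  my $gene_id = $line_hash->{Gene};

  # return a hashref of key/value pairs for the Extra column;
  # an empty hashref adds nothing
  return {};
}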

You can edit this hash freely.
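
As an illustration of editing the hash, here is a minimal sketch that rewrites one of the existing columns rather than adding to the Extra column. The package name FreqInColumn, the in-memory "freqs" lookup and the stand-in new() are hypothetical and only there so the example runs on its own; a real plugin would inherit from Bio::EnsEMBL::Variation::Utils::BaseVepPlugin instead. It also assumes the hash keys follow the output column names (e.g. Existing_variation), in the same way as the Gene key above.

```perl
use strict;
use warnings;

{
    package FreqInColumn;    # hypothetical plugin name

    # Stand-in constructor so the sketch runs without the Ensembl API.
    # A real plugin would instead inherit new() via:
    #   use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);
    sub new {
        my ($class, %args) = @_;
        return bless { freqs => $args{freqs} || {} }, $class;
    }

    sub run {
        my ($self, $tva, $line_hash) = @_;

        # Key name assumed to match the output column header
        my $rsid = $line_hash->{Existing_variation};
        return {}
            unless defined $rsid
            && $rsid ne '-'
            && exists $self->{freqs}{$rsid};

        # Overwrite the existing column in place
        $line_hash->{Existing_variation} =
            sprintf '%s(%.3f)', $rsid, $self->{freqs}{$rsid};

        # Nothing extra to add to the Extra column
        return {};
    }
}

# Demo: call run() by hand with a made-up line hash ($tva is unused here)
my $plugin = FreqInColumn->new(freqs => { rs123 => 0.25 });
my $line   = { Gene => 'ENSG00000000001', Existing_variation => 'rs123' };
$plugin->run(undef, $line);
print "$line->{Existing_variation}\n";
```

Because the hash is passed by reference, the change made inside run() is visible to the caller once run() returns.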

Will

On 4 May 2012 11:24, Duarte Molha <Duarte.Molha at ogt.co.uk> wrote:
> Yes Will
>
> That is what I am currently doing. I calculate an average allelic frequency (together with the minimum frequency observed and standard deviation) for any given population and then use the RSID given by the VEP script to fetch this data from my flat files.
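
For readers following the thread, the summary described above (mean, minimum and standard deviation, keyed by RSID) could be sketched roughly as below. The flat-file layout (tab-separated, RSID first, then one frequency per column) and the function names are assumptions made for illustration, not the poster's actual format, and the sample standard deviation (n - 1) is likewise an arbitrary choice.

```perl
use strict;
use warnings;
use List::Util qw(sum min);

# Summarise a list of allele frequencies: mean, minimum and sample SD.
sub summarise_freqs {
    my @f = @_;
    return unless @f;
    my $n    = scalar @f;
    my $mean = sum(@f) / $n;
    # Sample variance (n - 1); defined as 0 for a single value
    my $var  = $n > 1 ? sum(map { ($_ - $mean)**2 } @f) / ($n - 1) : 0;
    return { mean => $mean, min => min(@f), sd => sqrt($var) };
}

# Read a hypothetical tab-separated flat file: RSID, then frequencies.
sub load_freq_table {
    my ($fh) = @_;
    my %by_rsid;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^#/ || $line !~ /\S/;    # skip comments and blanks
        my ($rsid, @freqs) = split /\t/, $line;
        $by_rsid{$rsid} = summarise_freqs(@freqs);
    }
    return \%by_rsid;
}

# Demo using an in-memory "file" with made-up values
my $data = "rs1\t0.1\t0.2\t0.3\nrs2\t0.5\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $table = load_freq_table($fh);
printf "rs1 mean=%.3f min=%.3f sd=%.3f\n", @{ $table->{rs1} }{qw(mean min sd)};
```

A plugin could build such a table once at start-up and then look up each RSID inside run().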
>
> Previously I had to change the VEP script to do this. I am hoping I can change this to use only the plug-in infrastructure.
>
> Can I change the other columns in the output line using a plug-in, or does it only allow us to add key-value pairs to the Extra column?
>
> Cheers
>
> Duarte
>
> -----Original Message-----
> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On Behalf Of Will McLaren
> Sent: 04 May 2012 11:09
> To: Ensembl developers list
> Subject: Re: [ensembl-dev] VEP Frequency annotation
>
> Hi Duarte,
>
> You could achieve a similar thing by putting your frequencies in a BED-like or BigWig file and loading it into the VEP using --custom. See the documentation for details: http://www.ensembl.org/info/docs/variation/vep/vep_script.html#custom
>
> You could use VCFtools to calculate the frequencies, or parse them out of a VCF (if they're already there in the INFO field) using a simple perl script. I have done so with the 1000 Genomes VCF files and it works great.
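
A "simple perl script" of the kind mentioned above might look roughly like this sketch. It assumes the frequency is stored under an AF key in the INFO field, with one value per ALT allele, and it writes BED-like lines (chromosome, 0-based start, end, name) with "ALT:AF" in the name column. The function name and the name-column layout are illustrative choices only; the --custom documentation linked above describes what the VEP actually expects from the file.

```perl
use strict;
use warnings;

# Convert one VCF data line into BED-like lines, one per ALT allele.
# Returns an empty list for header lines or lines without an AF value.
sub vcf_line_to_bed {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^#/;

    my ($chrom, $pos, $id, $ref, $alt, $qual, $filter, $info) = split /\t/, $line;
    return unless defined $info;

    # Match AF only as a whole key, not e.g. EUR_AF
    my ($af) = $info =~ /(?:^|;)AF=([^;]+)/;
    return unless defined $af;

    my @alts = split /,/, $alt;
    my @afs  = split /,/, $af;

    # BED coordinates: 0-based start, end exclusive
    my $start = $pos - 1;
    my $end   = $start + length($ref);

    my @out;
    for my $i (0 .. $#alts) {
        next unless defined $afs[$i];
        push @out, join "\t", $chrom, $start, $end, "$alts[$i]:$afs[$i]";
    }
    return @out;
}

# Demo with made-up VCF lines
my @demo = (
    "##fileformat=VCFv4.1",
    "1\t1000\trs1\tA\tG\t.\tPASS\tAC=5;AF=0.25;AN=20",
    "1\t2000\trs2\tC\tT,G\t.\tPASS\tAF=0.10,0.02",
);
for my $vcf_line (@demo) {
    print "$_\n" for vcf_line_to_bed($vcf_line);
}
```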
>
> You could then access this frequency data through a plugin.
>
> The problem with adding the frequencies used to filter is that often there are multiple frequencies pulled from the database; we don't have one "global" frequency to use (although in future this may be possible using a global frequency from the 1000 Genomes for example), and the output may get messy.
>
> Hope this helps
>
> Will
>
> On 4 May 2012 11:00, Duarte Molha <Duarte.Molha at ogt.co.uk> wrote:
>> Dear Devs,
>>
>> I have another question/feature request regarding the VEP script.
>>
>> I know that we can use the script to filter the variations according
>> to allelic frequency. It would be very useful if an option could be
>> added so that it outputs the frequency for the allele (whatever it
>> may be) without doing any filtering. Currently I have made a workaround
>> by preprocessing all allelic frequencies for v64 and then changing the
>> VEP script to add that information to the RSID field. Now that you
>> allow plugins, I believe I can make a plugin to do this without having
>> to hack the VEP code. However, it would be very useful, since you
>> already have code in place to check for frequencies, if you could
>> output the frequencies without forcing any filtering.
>>
>> Best regards
>>
>> Duarte Molha
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> List admin (including subscribe/unsubscribe):
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>