[ensembl-dev] Variation (VEP) - Het/Hom Output

Will McLaren wm2 at ebi.ac.uk
Thu May 17 16:55:50 BST 2012


Hi Ricardo,

That's much clearer, thanks!

In 2.5 we introduced the --individual flag, which allows you to
analyse variants on a per-individual basis. So, assuming your
individual has an identifier in the VCF header row (let's say IND1),
you can ask the VEP to analyse the data taking into account the
genotype of that individual:

perl variant_effect_predictor.pl -i my_data.vcf -individual IND1

By default this will not output the HOM/HET status of the individual's
genotype for each SNP, but it would be trivial to write a plugin to do
so. See the documentation for full details
(http://www.ensembl.org/info/docs/variation/vep/vep_script.html#plugins),
but the following code should get you off to a start.

You will need to have the latest API version checked out as I have
added in a couple of fixes in the last few days in this area of code.

I will probably also add this as a feature to the "core" VEP in the
next release.

Hope this helps!

Cheers

Will

##### CODE FOLLOWS ######
# put this in ~/.vep/Plugins/HomHet.pm
# run as perl variant_effect_predictor.pl -plugin HomHet [other_options]

package HomHet;

use strict;
use warnings;

use Bio::EnsEMBL::Variation::Utils::BaseVepPlugin;

use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);

sub new {
    my $class = shift;

    my $self = $class->SUPER::new(@_);

    die "ERROR: You must be using --individual to use this plugin\n"
unless defined($self->{config}->{individual});

    return $self;
}

sub version {
    return '2.5';
}

sub feature_types {
    return ['Feature','Intergenic'];
}

sub get_header_info {
    return {
        HomHet => "Zygosity of individual genotype",
    };
}

sub run {
    my ($self, $tva) = @_;

    my $tv = $tva->transcript_variation;
    my $vf = $tv->variation_feature;

    my $gt = $vf->{genotype};
    return {} unless defined $gt;

    my %alleles = map {$_ => 1} @$gt;
    my $hom_het = scalar keys %alleles > 1 ? 'HET' : 'HOM';

    return {
        HomHet => $hom_het
    }
}

1;




On 17 May 2012 16:26, Ricardo Parolin Schnekenberg
<ricardos at well.ox.ac.uk> wrote:
> Hi Will,
>
> Sorry for not being so clear.
> I am talking about the Perl script (2.5). Here at the centre we call our
> variants with Platypus which outputs a nice VCF file in which the last
> column tells us the genotype for each variant ([0/0] 0/1 1/0 1/1). Due to
> some bugs in Annovar (which used this genotype column very well and output
> HET or HOM based on that) we decided to try using VEP for our project. It
> would be useful if VEP could do the same thing. Currently I had to write a
> script that reads the output of VEP, grabs the location of the variant,
> uses vcf tools (vcf-query) to query the vcf file for that variant and then
> returns the genotype column. It is definitely doable, but messy (and I am
> not a bioinformatics person).
>
> We need this information in order to work out models of inheritance to
> analyze whole exome sequencing data from lets say an affected proband and
> his or her two unaffacted parents.
>
> Many thanks,
>
> --
> Ricardo Parolin Schnekenberg
> Genomics Research
> Wellcome Trust Centre for Human Genetics
> University of Oxford
>
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/




More information about the Dev mailing list