[ensembl-dev] Variation (VEP) 2.5 - Het/Hom Output - Plugin Attached
Sarah Hunt
seh at ebi.ac.uk
Tue May 22 11:00:51 BST 2012
Hi Ricardo,
Thanks for sharing this modified code, but I'm afraid it doesn't quite
do what you need. The variation feature allele string describes the
variant rather than any specific genotypes, so you will notice all
variants are reported as heterozygous. (You correctly observed the
genotype being extracted at line 518 of VEP.pm, but a little further
on at line 526, the alleles are used as keys in a hash which removes
redundancy before the allele string is created.)
Did you have problems running the code Will mailed? It ran fine when I
tested it using the latest API version.
There is one potential issue however. The --individual flag analyses
only positions variant in the individual(s) specified, so you will not
see hom/het statuses output for locations at which this individual is
homozyous and matches the reference sequence. Is this actually the
behaviour you require?
Best wishes,
Sarah
On Mon, May 21, 2012 at 1:55 PM, Ricardo Parolin Schnekenberg
<ricardos at well.ox.ac.uk> wrote:
> Following on the thread 2012-May/007541.html and with the kind help of
> Will McLaren, I came up with a little plugin that outputs a HomHet=HOM or
> HomHet=HET in the Extra column of the VEP output (as long as you are using
> the --individual flag).
>
> It relies on the VEP.pm detecting that the --individual flag is turned on
> and then parsing the extra column, splitting on the /:/ and getting the
> first element [0], which carries the genotype of each individual in a 0/0
> 1/1 0/1 0/1 form (And it considers the possibility of being a forward
> slash, backslash or pipe). Then it saves it as the allele_string of
> variation_feature.
>
> The plugin itself just grabs whatever data is in allele_string and splits
> the slashes/pipes and counts the number of "alleles".
>
>>2 alleles (eg: A/C/C) means homozygous (check line 518 of VEP.pm)
> <=2 alleles (e.g A/C) means heterozygous.
>
> I don't think VEP would output a line if there was no variant (so
> allele_string would be just $ref), but in that case change it accordingly.
>
> I am a medic, not a programmer so please check/recheck the code. It might
> be just totally wrong, which would be embarassing, but oh well.
>
>
> ############
> ## HOM/HET Plugin
> ## To be used with VEP 2.5 Onwards (which support the --individual option)
> ## By Ricardo Parolin Schnekenberg
> ## Wellcome Trust Centre for Human Genetics / University of Oxford
>
> package HomHet;
>
> use strict;
> use warnings;
>
> use Bio::EnsEMBL::Variation::Utils::BaseVepPlugin;
> use Bio::EnsEMBL::Variation::Utils::VEP;
>
> use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);
>
> sub new {
> my $class = shift;
>
> my $self = $class->SUPER::new(@_);
>
> die "ERROR: You must be using --individual to use this plugin\n"
> unless defined($self->{config}->{individual});
>
> return $self;
> }
>
> sub version {
> return '2.5';
> }
>
> sub feature_types {
> return ['Feature','Intergenic'];
> }
>
> sub get_header_info {
> return {
> HomHet => "Zygosity of individual genotype",
> };
> }
>
> sub run {
> my ($self, $tva) = @_;
>
> my $tv = $tva->transcript_variation;
> my $vf = $tv->variation_feature;
>
>
> my $allele_string = $vf->{allele_string};
> return {} unless defined $allele_string;
>
> my @alleles = split /\||\/|\\/, $allele_string;
>
>
> my $hom_het = scalar @alleles > 2 ? 'HOM' : 'HET';
>
>
>
>
>
> return {
> HomHet => $hom_het,
> }
> }
>
> 1;
>
>
> --
> Ricardo Parolin Schnekenberg
> Genomics Research
> Wellcome Trust Centre for Human Genetics
> University of Oxford
>
>
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
More information about the Dev
mailing list