[ensembl-dev] Warning message in from VariationFeature.pm

Duarte Molha duartemolha at gmail.com
Thu Aug 1 10:39:11 BST 2013


On a related note, Will:

I have a plugin that I want to run for every output annotation line. It
basically adds the genotype fields from the VCF as extra fields.
It works fine in the large majority of cases, but in some the value of
$vf->{base_variation_feature_overlap}->{base_variation_feature}->{_line}
seems to be undefined (I've highlighted the line in question with comments).

Since I need this line to extract the fields I am interested in, can you
tell me what I might be doing wrong?

Here is the code of the plugin:


###########################################
=head1 LICENSE

    Selected_VCF_fields_output
    Copyright (C) 2013  Duarte Molha

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.

=head1 CONTACT

 Questions may also be sent to <duartemolha at gmail.com>.

=cut

=head1 NAME

Selected_VCF_fields_output

=head1 SYNOPSIS

    mv Selected_VCF_fields_output.pm ~/.vep/Plugins
    perl variant_effect_predictor.pl -i variations.vcf --plugin Selected_VCF_fields_output


=head1 DESCRIPTION

This plugin retrieves the quality score field and the genotype fields
from the input VCF and writes them to the tab-delimited annotation output
file.

=cut

package Selected_VCF_fields_output;

use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);

use strict;
use warnings;

sub version {
    return '71';
}

sub new {
    my $class = shift;
    my $self = $class->SUPER::new(@_);
    return $self;
}


sub get_header_info {
    return {
        "quality_score"  => "Quality score from VCF input Field",
        "GT_PARAMS_AD"   => "Allelic depths for the ref and alt alleles in the order listed",
        "GT_PARAMS_DP"   => "Read Depth (only filtered reads used for calling)",
        "GT_PARAMS_GQ"   => "Genotype Quality",
        "GT_PARAMS_GT"   => "Genotype",
        "GT_PARAMS_PL"   => "Normalized, Phred-scaled likelihoods for AA,AB,BB genotypes where A=ref and B=alt; not applicable if site is not biallelic",
        "GT_PARAMS_SDP"  => "Raw Read Depth as reported by SAMtools",
        "GT_PARAMS_RD"   => "Depth of reference-supporting bases (reads1)",
        "GT_PARAMS_FREQ" => "Variant allele frequency",
        "GT_PARAMS_PVAL" => "P-value from Fisher's Exact Test",
        "GT_PARAMS_RBQ"  => "Average quality of reference-supporting bases (qual1)",
        "GT_PARAMS_ABQ"  => "Average quality of variant-supporting bases (qual2)",
        "GT_PARAMS_RDF"  => "Depth of reference-supporting bases on forward strand (reads1plus)",
        "GT_PARAMS_RDR"  => "Depth of reference-supporting bases on reverse strand (reads1minus)",
        "GT_PARAMS_ADF"  => "Depth of variant-supporting bases on forward strand (reads2plus)",
        "GT_PARAMS_ADR"  => "Depth of variant-supporting bases on reverse strand (reads2minus)",
    };
}

sub feature_types {
    return ['Feature', 'Intergenic'];
}


sub run {
    my $self = shift;
    my $vf = shift;
    my $line_hash = shift;

    my $config = $self->{config};

    if (defined($config->{individual}) && $config->{format} eq 'vcf') {
        my $ind_cols = $config->{ind_cols};

        ########################################################################
        # In this line the {_line} field is sometimes undef. Why???
        my $line = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{_line};
        ########################################################################

        my $individual = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{individual};

        # Split the raw VCF line into columns; column 8 is FORMAT.
        my @split_line = split /[\s\t]+/, $line;
        my @gt_format  = split /:/, $split_line[8];
        foreach my $p (@gt_format) {
            $p = "GT_PARAMS_" . $p;
        }

        # Pair each FORMAT key with this individual's value.
        my @gt_data = split /:/, $split_line[$ind_cols->{$individual}];
        my $results = { map { shift @gt_format => $_ } @gt_data };
        $results->{"quality_score"} = $split_line[5];    # QUAL column
        return $results;
    }
    else {
        return {};
    }
}

1;
###########################################################
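
In case it helps with reproducing this, the parsing step can be exercised
outside VEP. Below is a minimal sketch, not the plugin itself: the helper
name parse_gt_fields is hypothetical (it is not part of the VEP API), it
assumes tab-delimited input and a 0-based sample column index, and it simply
returns an empty hash when the line is undefined instead of warning.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the VEP API): extract the QUAL column and
# one sample's genotype fields from a raw, tab-delimited VCF data line.
#   $line    - raw VCF line; may be undef
#   $col_idx - 0-based column index of the sample (9 is the first sample)
sub parse_gt_fields {
    my ($line, $col_idx) = @_;

    # Guard: nothing to parse if the line or the column index is missing.
    return {} unless defined $line && defined $col_idx;

    my @cols = split /\t/, $line;
    return {} unless defined $cols[8] && defined $cols[$col_idx];

    my @keys   = map { "GT_PARAMS_$_" } split /:/, $cols[8];    # FORMAT column
    my @values = split /:/, $cols[$col_idx];                    # sample column

    my %results;
    # Pair keys with values; a sample may carry fewer values than FORMAT keys.
    for my $i (0 .. $#values) {
        last if $i > $#keys;
        $results{ $keys[$i] } = $values[$i];
    }
    $results{quality_score} = $cols[5];                         # QUAL column
    return \%results;
}

# Example input: a shortened version of the VCF line from this thread.
my $line = join "\t", split / /,
    '1 777437 . A C 667.93 PASS AC=1 GT:AD:DP:GQ:PL '
  . '0/0:60,0:60:99:0,138,1819 0/1:52,28:80:99:706,0,1290';

my $res = parse_gt_fields($line, 10);
print "$_ => $res->{$_}\n" for sort keys %$res;
```

Inside the plugin this would be called with the same values the run method
already uses, i.e. parse_gt_fields($line, $ind_cols->{$individual}). It only
hides the warning; it does not explain why {_line} is missing for those
variants.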



=========================
     Duarte Miguel Paulo Molha
         http://about.me/duarte
=========================


On Thu, Aug 1, 2013 at 9:49 AM, Duarte Molha <duartemolha at gmail.com> wrote:

> Thanks Will
>
> I should have checked that before asking :S
>
> I'll redownload and check if the error is gone ... thanks
>
> Duarte
>
>
> On Thu, Aug 1, 2013 at 9:46 AM, Will McLaren <wm2 at ebi.ac.uk> wrote:
>
>> Hi Duarte,
>>
>> I think this is a bug I've already found and fixed - can you update your
>> ensembl-variation API and try again?
>>
>> Here's the fix for reference:
>>
>>
>> http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm?root=ensembl&r1=1.101.2.4&r2=1.101.2.5
>>
>> Will
>>
>>
>> On 1 August 2013 09:33, Duarte Molha <duartemolha at gmail.com> wrote:
>>
>>> I believe the main problem is that this VariationFeature, for some
>>> reason, does not have a slice attached to it: 'slice' => undef
>>>
>>> so the call that fetches the feature slice and expands it
>>>
>>>
>>> 471:                my $slice = $self->feature_Slice->expand(
>>> 472:                    MAX_DISTANCE_FROM_TRANSCRIPT,
>>> 473:                    MAX_DISTANCE_FROM_TRANSCRIPT
>>> 474:                );
>>>
>>> fails.
>>>
>>> Does anyone know what might be causing this?
>>>
>>> Best regards
>>>
>>> Duarte
>>>
>>>
>>> On Wed, Jul 31, 2013 at 5:05 PM, Duarte Molha <duartemolha at gmail.com> wrote:
>>>
>>>> In an effort to better understand what might be causing this, here
>>>> is a dump of one such object causing the error message:
>>>>
>>>> the VCF line:
>>>> #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sample-01  sample-02  sample-03  sample-04  sample-05  sample-06
>>>> 1  777437  .  A  C  667.93  PASS  AC=1;AF=0.083;AN=12;BaseQRankSum=6.089;DP=487;Dels=0.00;FS=1.662;HRun=0;HaplotypeScore=0.9613;MQ=44.32;MQ0=49;MQRankSum=-1.435;QD=8.35;ReadPosRankSum=-1.520;SB=-247.14;set=variant2  GT:AD:DP:GQ:PL  0/0:60,0:60:99:0,138,1819  0/0:82,0:82:99:0,205,2614  0/1:52,28:80:99:706,0,1290  0/0:100,0:100:99:0,253,3074  0/0:83,0:83:99:0,178,2360  0/0:82,0:82:99:0,166,2135
>>>>
>>>> at line 471 of /Bio/EnsEMBL/Variation/VariationFeature.pm
>>>>
>>>> my object $self contains:
>>>>
>>>> 0  Bio::EnsEMBL::Variation::VariationFeature=HASH(0x57802b0)
>>>>    '_line' =>
>>>> "1\cI777437\cI.\cIA\cIC\cI667.93\cIPASS\cIAC=1;AF=0.083;AN=12;BaseQRankSum=6.089;DP=487;Dels=0.00;FS=1.662;HRun=0;HaplotypeScore=0.9613;MQ=44.32;MQ0=49;MQRankSum=-1.435;QD=8.35;ReadPosRankSum=-1.520;SB=-247.14;set=variant2\cIGT:AD:DP:GQ:PL\cI0/0:60,0:60:99:0,138,1819\cI0/0:82,0:82:99:0,205,2614\cI0/1:52,28:80:99:706,0,1290\cI0/0:100,0:100:99:0,253,3074\cI0/0:83,0:83:99:0,178,2360\cI0/0:82,0:82:99:0,166,2135"
>>>>    'adaptor' =>
>>>> Bio::EnsEMBL::Variation::DBSQL::VariationFeatureAdaptor=HASH(0x4b06030)
>>>>       '_is_multispecies' => ''
>>>>       'db' => Bio::EnsEMBL::Variation::DBSQL::DBAdaptor=HASH(0x52b4f08)
>>>>          '_dbc' => Bio::EnsEMBL::DBSQL::DBConnection=HASH(0x52b50d0)
>>>>             '_dbname' => 'homo_sapiens_variation_72_37'
>>>>             '_driver' => 'mysql'
>>>>             '_host' => 'ensembldb.ensembl.org'
>>>>             '_port' => 5306
>>>>             '_query_count' => 4
>>>>             '_timeout' => 0
>>>>             '_username' => 'anonymous'
>>>>             'connected32406' => 1
>>>>             'db_handle32406' => DBI::db=HASH(0x51e5ee8)
>>>>                  empty hash
>>>>             'reconnect_when_lost' => 1
>>>>          '_group' => 'variation'
>>>>          '_is_multispecies' => ''
>>>>          '_no_cache' => 1
>>>>          '_species' => 'homo_sapiens'
>>>>          '_species_id' => 1
>>>>       'dbc' => Bio::EnsEMBL::DBSQL::DBConnection=HASH(0x52b50d0)
>>>>          -> REUSED_ADDRESS
>>>>       'species_id' => 1
>>>>    'allele_string' => 'A'
>>>>    'chr' => 1
>>>>    'end' => 777437
>>>>    'existing' => ARRAY(0x10861900)
>>>>         empty array
>>>>    'genotype' => ARRAY(0x577ff80)
>>>>       0  'A'
>>>>       1  'A'
>>>>    'individual' => 'sample-01'
>>>>    'map_weight' => 1
>>>>    'non_variant' => 1
>>>>    'phased' => 1
>>>>    'slice' => undef
>>>>    'start' => 777437
>>>>    'strand' => 1
>>>>    'variation_name' => '1_777437_A'
>>>>
>>>>
>>>> On 31 July 2013 13:52, Duarte Molha <duartemolha at gmail.com> wrote:
>>>>
>>>>> Hi Devs
>>>>>
>>>>> I have been trying to run a VCF file through the variant annotation script
>>>>> and I've been getting a warning message that I have never
>>>>> encountered before.
>>>>>
>>>>> I was wondering if someone could let me know if it is something I am
>>>>> doing wrong.
>>>>>
>>>>> The message is:
>>>>>
>>>>> Can't call method "expand" on an undefined value at
>>>>> <sic>/Bio/EnsEMBL/Variation/VariationFeature.pm line 471
>>>>>
>>>>> Here are the configuration options I am using:
>>>>>
>>>>> Configuration options:
>>>>> ###
>>>>> allow_non_variant    1
>>>>> cache                1
>>>>> canonical            1
>>>>> ccds                 1
>>>>> check_alleles        1
>>>>> check_existing       1
>>>>> config               vep_human.ini
>>>>> core_type            core
>>>>> custom
>>>>> ./vep_additional_annotations/Somatic_variation_phenotypes.bed.gz,Somatic,bed,exact
>>>>> ./vep_additional_annotations/dbsnp135_ensembl_variation_phenotype.bed.gz,dbsnp135,bed,exact
>>>>> db_version           72
>>>>> dir                  /ReferenceData/vep_cache
>>>>> dir_cache            /ReferenceData/vep_cache
>>>>> dir_plugins          ./Plugins
>>>>> domains              1
>>>>> force_overwrite      1
>>>>> fork                 5
>>>>> gmaf                 1
>>>>> hgnc                 1
>>>>> host                 ensembldb.ensembl.org
>>>>> individual           all
>>>>> input_file           All_BOTH_SNPINDELfilter_PASSED.vcf
>>>>> maf_1kg              1
>>>>> numbers              1
>>>>> output_file          All_BOTH_SNPINDELfilter_PASSED.ann
>>>>> plugin
>>>>> OGT_NHBLI_MAF,/ReferenceData/NHLBI_EVS/NHLBI_OGT.gz
>>>>> OGT_selected_VCF_fields_output  Blosum62        Carol   OGT_Condel,b
>>>>> OGT_Grantham    TSSDistance     Downstream
>>>>> polyphen             b
>>>>> port                 5306
>>>>> protein              1
>>>>> regulatory           1
>>>>> sift                 b
>>>>> species              homo_sapiens
>>>>> stats                HASH(0x4370ad8)
>>>>> terms                SO
>>>>> verbose              1
>>>>>
>>>>> I would be very grateful for your help.
>>>>>
>>>>>  Duarte Molha
>>>>>
>>>>
>>>>
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info:
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>>>
>>>
>>
>