[ensembl-dev] Warning message in from VariationFeature.pm

Will McLaren wm2 at ebi.ac.uk
Fri Aug 2 15:10:10 BST 2013


Hi Duarte,

I'm not sure why this would be. I doubt it's anything to do with the
specific code in your plugin, which looks fine.

Are you using --fork? I think another user had a similar issue a few
releases ago which I thought I had solved. Does the problem occur with and
without --fork?

Will

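While the cause of the undefined {_line} value is being tracked down, one
defensive option for the plugin quoted below is to check the value before
splitting it. The following is only a minimal standalone sketch, not part of
the VEP API: the helper name extract_gt_fields and its three arguments are
hypothetical, and it assumes (as the plugin's own indexing suggests) that the
ind_cols hash maps each sample name to a 0-based column index in the VCF line.
It also splits on tabs only, which assumes a well-formed tab-delimited VCF.

```perl
use strict;
use warnings;

# Hypothetical helper (not a VEP function): takes the raw VCF line (may be
# undef), a sample name, and a hashref of sample name => 0-based column index.
sub extract_gt_fields {
    my ($line, $individual, $ind_cols) = @_;

    # Return an empty result rather than warning when the input is missing.
    return {} if !defined($line) || !defined($individual);

    my $col = $ind_cols->{$individual};
    return {} if !defined($col);

    # Assumption: columns are tab-separated, as in a well-formed VCF.
    chomp $line;
    my @fields = split /\t/, $line;
    return {} if !defined($fields[8]) || !defined($fields[$col]);

    # Column 8 (0-based) is FORMAT; prefix each key as the plugin does.
    my @keys   = map { "GT_PARAMS_$_" } split /:/, $fields[8];
    my @values = split /:/, $fields[$col];

    my %results = (quality_score => $fields[5]);
    for my $i (0 .. $#keys) {
        # Trailing genotype sub-fields may be absent; skip those.
        next if !defined($values[$i]);
        $results{ $keys[$i] } = $values[$i];
    }
    return \%results;
}

# Shortened example line based on the record discussed in this thread.
my $line = join "\t", qw(1 777437 . A C 667.93 PASS AC=1 GT:AD:DP:GQ:PL),
    '0/0:60,0:60:99:0,138,1819', '0/1:52,28:80:99:706,0,1290';
my %ind_cols = ('sample-01' => 9, 'sample-03' => 10);

my $res = extract_gt_fields($line, 'sample-03', \%ind_cols);
print "$_=$res->{$_}\n" for sort keys %$res;

# An undefined line yields an empty hashref instead of a warning.
my $none = extract_gt_fields(undef, 'sample-03', \%ind_cols);
print "fields for undefined line: ", scalar(keys %$none), "\n";
```

Inside a plugin's run method the same checks could be applied before the
split, returning {} when the line is missing. This only suppresses the
symptom; it does not explain why {_line} is lost in the first place.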

On 1 August 2013 10:39, Duarte Molha <duartemolha at gmail.com> wrote:

> On a related note, Will:
>
> I have a plugin that I want to run for every output annotation line. It
> basically adds the genotype fields from the VCF as extra fields.
> It works fine in the large majority of cases, but in some
> the $vf->{base_variation_feature_overlap}->{base_variation_feature}->{_line}
> value seems to be undefined (I've highlighted the line in question with comments).
>
> Since I need this line to extract the fields I am interested in, can
> you tell me what I might be doing wrong?
>
> Here is the code of the plugin:
>
>
> ###########################################
> =head1 LICENSE
>
>     Selected_VCF_fields_output
>     Copyright (C) 2013  Duarte Molha
>
>     This program is free software: you can redistribute it and/or modify
>     it under the terms of the GNU General Public License as published by
>     the Free Software Foundation, either version 3 of the License, or
>     (at your option) any later version.
>
>     This program is distributed in the hope that it will be useful,
>     but WITHOUT ANY WARRANTY; without even the implied warranty of
>     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>     GNU General Public License for more details.
>
>     You should have received a copy of the GNU General Public License
>     along with this program.  If not, see <http://www.gnu.org/licenses/>.
>
> =head1 CONTACT
>
>  Questions may also be sent to <duartemolha at gmail.com>.
>
> =cut
>
> =head1 NAME
>
> Selected_VCF_fields_output
>
> =head1 SYNOPSIS
>
>     mv Selected_VCF_fields_output.pm ~/.vep/Plugins
>     perl variant_effect_predictor.pl -i variations.vcf --plugin Selected_VCF_fields_output
>
>
> =head1 DESCRIPTION
>
>  This plugin retrieves the quality score field and the genotype fields
> from the input VCF and writes them to the tab-delimited annotation output
> file.
>
> =cut
>
> package Selected_VCF_fields_output;
>
> use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);
>
> use strict;
> use warnings;
>
> sub version {
>     return '71';
> }
>
> sub new {
>     my $class = shift;
>     my $self = $class->SUPER::new(@_);
>     return $self;
> }
>
>
> sub get_header_info {
>     return {
>         "quality_score"  => "Quality score from VCF input Field",
>         "GT_PARAMS_AD"   => "Allelic depths for the ref and alt alleles in the order listed",
>         "GT_PARAMS_DP"   => "Read Depth (only filtered reads used for calling)",
>         "GT_PARAMS_GQ"   => "Genotype Quality",
>         "GT_PARAMS_GT"   => "Genotype",
>         "GT_PARAMS_PL"   => "Normalized, Phred-scaled likelihoods for AA,AB,BB genotypes where A=ref and B=alt; not applicable if site is not biallelic",
>         "GT_PARAMS_SDP"  => "Raw Read Depth as reported by SAMtools",
>         "GT_PARAMS_RD"   => "Depth of reference-supporting bases (reads1)",
>         "GT_PARAMS_FREQ" => "Variant allele frequency",
>         "GT_PARAMS_PVAL" => "P-value from Fisher's Exact Test",
>         "GT_PARAMS_RBQ"  => "Average quality of reference-supporting bases (qual1)",
>         "GT_PARAMS_ABQ"  => "Average quality of variant-supporting bases (qual2)",
>         "GT_PARAMS_RDF"  => "Depth of reference-supporting bases on forward strand (reads1plus)",
>         "GT_PARAMS_RDR"  => "Depth of reference-supporting bases on reverse strand (reads1minus)",
>         "GT_PARAMS_ADF"  => "Depth of variant-supporting bases on forward strand (reads2plus)",
>         "GT_PARAMS_ADR"  => "Depth of variant-supporting bases on reverse strand (reads2minus)",
>     };
> }
>
> sub feature_types {
>     return ['Feature', 'Intergenic'];
> }
>
>
> sub run {
>     my $self = shift;
>     my $vf = shift;
>     my $line_hash = shift;
>
>     my $config = $self->{config};
>
>     if (defined($config->{individual}) && $config->{format} eq 'vcf') {
>         my $ind_cols = $config->{ind_cols};
>
>         ####################################################################
>         # In this line the {_line} field is sometimes undef. Why???
>         my $line = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{_line};
>         ####################################################################
>
>         my $individual = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{individual};
>         my @split_line = split /[\s\t]+/, $line;
>         my @gt_format  = split /:/, $split_line[8];
>         foreach my $p (@gt_format) {
>             $p = "GT_PARAMS_" . $p;
>         }
>         my @gt_data = split /:/, $split_line[$ind_cols->{$individual}];
>         my $results = { map { shift @gt_format => $_ } @gt_data };
>         $results->{"quality_score"} = $split_line[5];
>         return $results;
>     }
>     else {
>         return {};
>     }
> }
>
> 1;
> ###########################################################
>
>
>
> =========================
>      Duarte Miguel Paulo Molha
>          http://about.me/duarte
> =========================
>
>
> On Thu, Aug 1, 2013 at 9:49 AM, Duarte Molha <duartemolha at gmail.com> wrote:
>
>> Thanks Will
>>
>> I should have checked that before asking :S
>>
>> I'll redownload and check if the error is gone ... thanks
>>
>> Duarte
>>
>>
>> =========================
>>      Duarte Miguel Paulo Molha
>>          http://about.me/duarte
>> =========================
>>
>>
>> On Thu, Aug 1, 2013 at 9:46 AM, Will McLaren <wm2 at ebi.ac.uk> wrote:
>>
>>> Hi Duarte,
>>>
>>> I think this is a bug I've already found and fixed - can you update your
>>> ensembl-variation API and try again?
>>>
>>> Here's the fix for reference:
>>>
>>>
>>> http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm?root=ensembl&r1=1.101.2.4&r2=1.101.2.5
>>>
>>> Will
>>>
>>>
>>> On 1 August 2013 09:33, Duarte Molha <duartemolha at gmail.com> wrote:
>>>
>>>> I believe the main problem is that this variation feature, for some
>>>> reason, does not have a slice attached to it ('slice' => undef),
>>>>
>>>> so the call that gets the feature slice and expands it
>>>>
>>>>
>>>> 471:                my $slice = $self->feature_Slice->expand(
>>>> 472:                    MAX_DISTANCE_FROM_TRANSCRIPT,
>>>> 473:                    MAX_DISTANCE_FROM_TRANSCRIPT
>>>> 474:                );
>>>>
>>>> fails.
>>>>
>>>> Does anyone know what might be causing this?
>>>>
>>>> Best regards
>>>>
>>>> Duarte
>>>>
>>>>
>>>>
>>>> =========================
>>>>      Duarte Miguel Paulo Molha
>>>>          http://about.me/duarte
>>>> =========================
>>>>
>>>>
>>>> On Wed, Jul 31, 2013 at 5:05 PM, Duarte Molha <duartemolha at gmail.com> wrote:
>>>>
>>>>> In an effort to better understand what might be causing this, here
>>>>> is a dump of one such object causing the error message:
>>>>>
>>>>> the VCF line:
>>>>> #CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	sample-01	sample-02	sample-03	sample-04	sample-05	sample-06
>>>>> 1	777437	.	A	C	667.93	PASS	AC=1;AF=0.083;AN=12;BaseQRankSum=6.089;DP=487;Dels=0.00;FS=1.662;HRun=0;HaplotypeScore=0.9613;MQ=44.32;MQ0=49;MQRankSum=-1.435;QD=8.35;ReadPosRankSum=-1.520;SB=-247.14;set=variant2	GT:AD:DP:GQ:PL	0/0:60,0:60:99:0,138,1819	0/0:82,0:82:99:0,205,2614	0/1:52,28:80:99:706,0,1290	0/0:100,0:100:99:0,253,3074	0/0:83,0:83:99:0,178,2360	0/0:82,0:82:99:0,166,2135
>>>>>
>>>>> at line 471 of /Bio/EnsEMBL/Variation/VariationFeature.pm
>>>>>
>>>>> my object $self contains:
>>>>>
>>>>> 0  Bio::EnsEMBL::Variation::VariationFeature=HASH(0x57802b0)
>>>>>    '_line' =>
>>>>> "1\cI777437\cI.\cIA\cIC\cI667.93\cIPASS\cIAC=1;AF=0.083;AN=12;BaseQRankSum=6.089;DP=487;Dels=0.00;FS=1.662;HRun=0;HaplotypeScore=0.9613;MQ=44.32;MQ0=49;MQRankSum=-1.435;QD=8.35;ReadPosRankSum=-1.520;SB=-247.14;set=variant2\cIGT:AD:DP:GQ:PL\cI0/0:60,0:60:99:0,138,1819\cI0/0:82,0:82:99:0,205,2614\cI0/1:52,28:80:99:706,0,1290\cI0/0:100,0:100:99:0,253,3074\cI0/0:83,0:83:99:0,178,2360\cI0/0:82,0:82:99:0,166,2135"
>>>>>    'adaptor' =>
>>>>> Bio::EnsEMBL::Variation::DBSQL::VariationFeatureAdaptor=HASH(0x4b06030)
>>>>>       '_is_multispecies' => ''
>>>>>       'db' => Bio::EnsEMBL::Variation::DBSQL::DBAdaptor=HASH(0x52b4f08)
>>>>>          '_dbc' => Bio::EnsEMBL::DBSQL::DBConnection=HASH(0x52b50d0)
>>>>>             '_dbname' => 'homo_sapiens_variation_72_37'
>>>>>             '_driver' => 'mysql'
>>>>>             '_host' => 'ensembldb.ensembl.org'
>>>>>             '_port' => 5306
>>>>>             '_query_count' => 4
>>>>>             '_timeout' => 0
>>>>>             '_username' => 'anonymous'
>>>>>             'connected32406' => 1
>>>>>             'db_handle32406' => DBI::db=HASH(0x51e5ee8)
>>>>>                  empty hash
>>>>>             'reconnect_when_lost' => 1
>>>>>          '_group' => 'variation'
>>>>>          '_is_multispecies' => ''
>>>>>          '_no_cache' => 1
>>>>>          '_species' => 'homo_sapiens'
>>>>>          '_species_id' => 1
>>>>>       'dbc' => Bio::EnsEMBL::DBSQL::DBConnection=HASH(0x52b50d0)
>>>>>          -> REUSED_ADDRESS
>>>>>       'species_id' => 1
>>>>>    'allele_string' => 'A'
>>>>>    'chr' => 1
>>>>>    'end' => 777437
>>>>>    'existing' => ARRAY(0x10861900)
>>>>>         empty array
>>>>>    'genotype' => ARRAY(0x577ff80)
>>>>>       0  'A'
>>>>>       1  'A'
>>>>>    'individual' => 'sample-01'
>>>>>    'map_weight' => 1
>>>>>    'non_variant' => 1
>>>>>    'phased' => 1
>>>>>    'slice' => undef
>>>>>    'start' => 777437
>>>>>    'strand' => 1
>>>>>    'variation_name' => '1_777437_A'
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> =========================
>>>>>      Duarte Miguel Paulo Molha
>>>>>          http://about.me/duarte
>>>>> =========================
>>>>>
>>>>>
>>>>> On 31 July 2013 13:52, Duarte Molha <duartemolha at gmail.com> wrote:
>>>>>
>>>>>> Hi Devs
>>>>>>
>>>>>> I have been trying to run a VCF file through the variant annotation
>>>>>> script and I've been getting a warning message that I have never
>>>>>> encountered before.
>>>>>>
>>>>>> I was wondering if someone could let me know whether I am doing
>>>>>> something wrong.
>>>>>>
>>>>>>  The message is :
>>>>>>
>>>>>> Can't call method "expand" on an undefined value at
>>>>>> <sic>/Bio/EnsEMBL/Variation/VariationFeature.pm line 471
>>>>>>
>>>>>>
>>>>>> Here are the configuration options I am using:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Configuration options:
>>>>>> ###
>>>>>> allow_non_variant    1
>>>>>> cache                1
>>>>>> canonical            1
>>>>>> ccds                 1
>>>>>> check_alleles        1
>>>>>> check_existing       1
>>>>>> config               vep_human.ini
>>>>>> core_type            core
>>>>>> custom               ./vep_additional_annotations/Somatic_variation_phenotypes.bed.gz,Somatic,bed,exact
>>>>>>                      ./vep_additional_annotations/dbsnp135_ensembl_variation_phenotype.bed.gz,dbsnp135,bed,exact
>>>>>> db_version           72
>>>>>> dir                  /ReferenceData/vep_cache
>>>>>> dir_cache            /ReferenceData/vep_cache
>>>>>> dir_plugins          ./Plugins
>>>>>> domains              1
>>>>>> force_overwrite      1
>>>>>> fork                 5
>>>>>> gmaf                 1
>>>>>> hgnc                 1
>>>>>> host                 ensembldb.ensembl.org
>>>>>> individual           all
>>>>>> input_file           All_BOTH_SNPINDELfilter_PASSED.vcf
>>>>>> maf_1kg              1
>>>>>> numbers              1
>>>>>> output_file          All_BOTH_SNPINDELfilter_PASSED.ann
>>>>>> plugin               OGT_NHBLI_MAF,/ReferenceData/NHLBI_EVS/NHLBI_OGT.gz
>>>>>>                      OGT_selected_VCF_fields_output
>>>>>>                      Blosum62
>>>>>>                      Carol
>>>>>>                      OGT_Condel,b
>>>>>>                      OGT_Grantham
>>>>>>                      TSSDistance
>>>>>>                      Downstream
>>>>>> polyphen             b
>>>>>> port                 5306
>>>>>> protein              1
>>>>>> regulatory           1
>>>>>> sift                 b
>>>>>> species              homo_sapiens
>>>>>> stats                HASH(0x4370ad8)
>>>>>> terms                SO
>>>>>> verbose              1
>>>>>>
>>>>>>
>>>>>> I would be very grateful for your help.
>>>>>>
>>>>>>  Duarte Molha
>>>>>>
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> Dev mailing list    Dev at ensembl.org
>>>> Posting guidelines and subscribe/unsubscribe info:
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
>>>>
>>>>
>>>
>>
>