[ensembl-dev] Fwd: problem in variant effect predictor stand alone

ketan padiya ketanmicro at gmail.com
Tue Feb 22 09:25:46 GMT 2011


---------- Forwarded message ----------
From: ketan padiya <ketanmicro at gmail.com>
Date: Tue, Feb 22, 2011 at 11:42 AM
Subject: Re: [ensembl-dev] problem in variant effect predictor stand alone
To: Will McLaren <wm2 at ebi.ac.uk>



Hi Will,

The new Perl script is still giving errors:

[orf@localhost new]$ perl variant_effect_predictor.pl -i ../../samtools-0.1.12a/samfiles/q20/vcf/GKUNU9Q04_chr28_q20_sort.vcf -o chr28.txt -s cow -w -b 1000
WARNING: Start 7355 or end . coordinate invalid on line 4
WARNING: Start 27062066 or end . coordinate invalid on line 356
WARNING: Start 28411255 or end . coordinate invalid on line 492
WARNING: Start 40292556 or end . coordinate invalid on line 596

Analyzing chromosome 28 at variant_effect_predictor.pl line 622, <GEN0> line
688.

 - fetched 382 transcripts at variant_effect_predictor.pl line 632, <GEN0>
line 688.

Could not connect to database bos_taurus_core_61_4j as user anonymous using
[DBI:mysql:database=bos_taurus_core_61_4j;host=ensembldb.ensembl.org;port=5306]
as a locator:
Unknown MySQL server host 'ensembldb.ensembl.org' (2) at
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/DBConnection.pm line
290, <GEN0> line 688.

-------------------- EXCEPTION --------------------
MSG: Could not connect to database bos_taurus_core_61_4j as user anonymous
using [DBI:mysql:database=bos_taurus_core_61_4j;host=ensembldb.ensembl.org;port=5306]
as a locator:
Unknown MySQL server host 'ensembldb.ensembl.org' (2)
STACK Bio::EnsEMBL::DBSQL::DBConnection::connect
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/DBConnection.pm:299
STACK Bio::EnsEMBL::DBSQL::DBConnection::db_handle
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/DBConnection.pm:618
STACK Bio::EnsEMBL::DBSQL::DBConnection::prepare
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/DBConnection.pm:647
STACK Bio::EnsEMBL::DBSQL::BaseAdaptor::prepare
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseAdaptor.pm:164
STACK Bio::EnsEMBL::DBSQL::AttributeAdaptor::fetch_all_by_
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/AttributeAdaptor.pm:282
STACK Bio::EnsEMBL::DBSQL::AttributeAdaptor::AUTOLOAD
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/AttributeAdaptor.pm:100
STACK Bio::EnsEMBL::Slice::get_all_Attributes
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Slice.pm:1237
STACK Bio::EnsEMBL::Slice::is_circular
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Slice.pm:536
STACK Bio::EnsEMBL::Slice::project
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Slice.pm:881
STACK Bio::EnsEMBL::DBSQL::SequenceAdaptor::fetch_by_Slice_start_end_strand
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/SequenceAdaptor.pm:220
STACK Bio::EnsEMBL::Slice::subseq
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Slice.pm:642
STACK Bio::EnsEMBL::Exon::seq
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Exon.pm:1467
STACK Bio::EnsEMBL::Transcript::spliced_seq
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Transcript.pm:768
STACK Bio::EnsEMBL::Transcript::translateable_seq
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Transcript.pm:824
STACK Bio::EnsEMBL::Transcript::translate
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Transcript.pm:1655
STACK Bio::EnsEMBL::Utils::TranscriptAlleles::type_variation
/home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Utils/TranscriptAlleles.pm:289
STACK
Bio::EnsEMBL::Variation::DBSQL::TranscriptVariationAdaptor::_calc_consequences
/home/orf/EnsEMBL/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/TranscriptVariationAdaptor.pm:586
STACK main::whole_genome_fetch variant_effect_predictor.pl:659
STACK toplevel variant_effect_predictor.pl:372
Ensembl API version = 61
---------------------------------------------------

Please help me to run the script successfully.

Thanks



On Mon, Feb 21, 2011 at 4:04 PM, Will McLaren <wm2 at ebi.ac.uk> wrote:

> Hi Ketan,
>
> I apologise, there were a couple of bugs which were causing your
> problems.
>
> They are now fixed, I have attached the patched file here.
>
> You should rerun this code on your data files, even if they worked OK
> before, as there was a bug which would have affected all insertions of 1bp.
>
> Thanks
>
> Will
>
>
> On 16 February 2011 04:59, ketan padiya <ketanmicro at gmail.com> wrote:
>
>> Thanks for the quick reply.
>>
>> Here are the input lines causing problems:
>>
>> chr1    5959615    .    tatggtccaatggtccaa    tatggtccaa    44.2    .    INDEL;DP=2;AF1=1;CI95=0.5,1;DP4=0,0,0,2;MQ=60    PL:GT:GQ    83,6,0:1/1:49
>> chr1    108473380    .    taaaa    taaa    3.66    .    INDEL;DP=2;AF1=1;CI95=0.5,1;DP4=0,0,2,0;MQ=60    PL:GT:GQ    40,6,0:1/1:49
>> chr1    127416494    .    acc    ac    3.66    .    INDEL;DP=3;AF1=1;CI95=0.5,1;DP4=0,0,0,2;MQ=60    PL:GT:GQ    40,6,0:1/1:49
>>
>> And here is another problem.
>>
>> The following command worked well with chr1 and gave results, but with
>> chr3 it fails (all VCF files were extracted using samtools):
>>
>> [orf@localhost snp_effect_predictor]$ perl variant_effect_predictor.pl -i ../samtools-0.1.12a/samfiles/q20/vcf/GKUNU9Q04_chr3_q20_sort.vcf -o chr3.txt -format vcf -s cow -w -b 5000
>>
>> -------------------- EXCEPTION --------------------
>> MSG: Start must be less than or equal to end+1
>> STACK Bio::EnsEMBL::Feature::new
>> /home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/Feature.pm:139
>> STACK Bio::EnsEMBL::Variation::VariationFeature::new
>> /home/orf/EnsEMBL/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/VariationFeature.pm:177
>> STACK toplevel variant_effect_predictor.pl:304
>> Ensembl API version = 61
>> ---------------------------------------------------
>> I am also attaching the VCF file for chr3.
>>
>>
>>
>>
>>
>> On Wed, Feb 16, 2011 at 10:14 AM, Will McLaren <wm2 at ebi.ac.uk> wrote:
>>
>>> Dear Ketan,
>>>
>>> Are you able to send me the input lines that are causing these errors? It
>>> is difficult for me to diagnose the problem without seeing the data.
>>>
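One way to pull out the input lines that the warnings point at is sketched below. It is only a sketch: the function name flagged_lines and the idea of saving the WARNING messages to a file are hypothetical, and it assumes that "on line N" in a warning is the 1-based line number in the input VCF.

```shell
# Hypothetical helper, not part of the script: print the lines of a VCF
# that saved WARNING messages refer to.
# Assumption: "on line N" is the 1-based line number in the input file.
#   $1 = file containing the WARNING messages
#   $2 = the VCF that was given to the script with -i
flagged_lines() {
    awk 'NR == FNR {
             # First file: remember every N from "WARNING: ... on line N".
             if ($0 ~ /^WARNING:/ && match($0, /on line [0-9]+/))
                 want[substr($0, RSTART + 8, RLENGTH - 8)] = 1
             next
         }
         # Second file: print only the remembered line numbers.
         (FNR in want)' "$1" "$2"
}
```

With the warnings saved in a file, something like "flagged_lines warnings.txt GKUNU9Q04_chr1_q20_sort.vcf" (warnings.txt being a made-up name) would print just the records worth sending.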
>>> Thanks
>>>
>>> Will
>>>
>>>
>>> On 16 February 2011 13:08, ketan padiya <ketanmicro at gmail.com> wrote:
>>>
>>>> Thanks for the reply. It worked well.
>>>>
>>>> One more question about INDELs: the variant_effect_predictor Perl script
>>>> doesn't recognize the INDEL allele strings and gives warnings like:
>>>>
>>>> WARNING: Invalid allele string atggtccaatggtccaa/atggtccaa on line 92
>>>> WARNING: Invalid allele string aaaa/aaa on line 1249
>>>> WARNING: Invalid allele string cc/c on line 1378
>>>>
>>>>
>>>>
>>>> On Tue, Feb 15, 2011 at 6:37 PM, Will McLaren <wm2 at ebi.ac.uk> wrote:
>>>>
>>>>> Dear Ketan,
>>>>>
>>>>> On 15 February 2011 20:13, ketan padiya <ketanmicro at gmail.com> wrote:
>>>>>
>>>>>> I have downloaded the variant effect predictor and the EnsEMBL API for
>>>>>> it. My problems are:
>>>>>>
>>>>>> 1) After every reboot of the system I have to set the PERL5LIB path again. Why?
>>>>>>
>>>>>
>>>>> You can configure your system to set PERL5LIB automatically every time a
>>>>> shell starts. How you do this depends on which shell you are using. For
>>>>> example, if you use the C shell (csh), you can edit the file named
>>>>> ".cshrc" in your home directory and add lines like:
>>>>>
>>>>> setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl/modules
>>>>> setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl-compara/modules
>>>>> setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl-functgenomics/modules
>>>>> setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl-variation/modules
>>>>>
>>>>> Or, if you use Bash, you can add the lines to ".bashrc", also in your
>>>>> home directory:
>>>>>
>>>>> PERL5LIB=${PERL5LIB}:${HOME}/src/bioperl-live
>>>>> PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl/modules
>>>>> PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl-compara/modules
>>>>> PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl-variation/modules
>>>>> PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl-functgenomics/modules
>>>>> export PERL5LIB
>>>>>
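A slightly more defensive variant of the Bash lines above is sketched here. If PERL5LIB starts out unset, the plain assignments produce a value that begins with a colon, and sourcing the file twice lists every directory twice. The function name add_perl5lib is hypothetical (it is not part of Ensembl), and the sketch assumes the checkouts live under ${HOME}/src as in the lines above.

```shell
# Hypothetical helper, not part of Ensembl: append a directory to PERL5LIB
# unless it is already listed, adding the ":" separator only when needed.
add_perl5lib() {
    case ":${PERL5LIB:-}:" in
        *":$1:"*) ;;                                    # already listed
        *) PERL5LIB="${PERL5LIB:+${PERL5LIB}:}$1" ;;    # append
    esac
}

# Assumes the checkouts live under ${HOME}/src, as in the lines above.
for module_dir in bioperl-live ensembl/modules ensembl-compara/modules \
                  ensembl-variation/modules ensembl-functgenomics/modules
do
    add_perl5lib "${HOME}/src/${module_dir}"
done
export PERL5LIB
```

In a new terminal, running "perl -MBio::EnsEMBL::Registry -e 1" should then exit without printing anything, provided the API and its dependencies can be loaded.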
>>>>>
>>>>>
>>>>>> 2) The variant effect predictor is taking too long to process the given
>>>>>> VCF file (~2000 lines of SNPs/INDELs)
>>>>>>
>>>>>
>>>>> For this and point 3), I recommend you try adding the following flags
>>>>> to your command:
>>>>>
>>>>> perl variant_effect_predictor.pl -i ../samtools-0.1.12a/samfiles/q20/vcf/GKUNU9Q04_chr1_q20_sort.vcf -o chr1.txt -format vcf -s cow -w -b 5000
>>>>>
>>>>> Using -format forces the program to read your file as a VCF.
>>>>>
>>>>> Using -w means the script runs in "whole-genome" mode, which is better
>>>>> suited to large data sets that cover, for example, one chromosome. You
>>>>> should make sure when you use this that the VCF input file is sorted by
>>>>> chromosome and then position.
>>>>>
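If the input is not already in that order, a sorted copy can be made with standard tools. The sketch below is one possible way: the function name sort_vcf is hypothetical, chromosome names are compared as plain text (so chr10 sorts before chr2), and it assumes the script only needs each chromosome's records grouped together and ordered by position.

```shell
# Hypothetical helper: print a VCF with its '#' header lines first, followed
# by the records sorted by chromosome (column 1, as text) and then by
# position (column 2, numerically).
#   $1 = input VCF
sort_vcf() {
    grep '^#' "$1"                                       # header lines, original order
    grep -v '^#' "$1" | LC_ALL=C sort -k1,1 -k2,2n       # records by CHROM, then POS
}
```

For example, "sort_vcf input.vcf > input_sorted.vcf" (both file names made up) writes a sorted copy that can then be passed to the script with -i.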
>>>>> Setting a larger buffer size with -b helps whole genome mode work
>>>>> faster.
>>>>>
>>>>> Please note that the -w option is only available from version 61 of
>>>>> Ensembl.
>>>>>
>>>>> Thanks and good luck!
>>>>>
>>>>> Will McLaren
>>>>> Ensembl Variation
>>>>>
>>>>>
>>>>>
>>>>>> 3) In the end it gives an error:
>>>>>>
>>>>>> [orf@localhost variant_effect_predictor]$ perl variant_effect_predictor.pl -i ../samtools-0.1.12a/samfiles/q20/vcf/GKUNU9Q04_chr1_q20_sort.vcf -o chr1.txt -s cow
>>>>>> WARNING: Start 5959615 or end . coordinate invalid on line 92 / INDEL
>>>>>> WARNING: Start 30571012 or end . coordinate invalid on line 572 / INDEL
>>>>>> WARNING: Start 64306203 or end . coordinate invalid on line 819 / INDEL
>>>>>> WARNING: Start 76575493 or end . coordinate invalid on line 895 / INDEL
>>>>>> DBD::mysql::st execute failed: Lost connection to MySQL server during
>>>>>> query at
>>>>>> /home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseAdaptor.pm line
>>>>>> 521, <GEN0> line 1006.
>>>>>>
>>>>>> -------------------- EXCEPTION --------------------
>>>>>> MSG: Detected an error whilst executing SQL 'SELECT
>>>>>> vf.variation_feature_id, vf.seq_region_id, vf.seq_region_start,
>>>>>> vf.seq_region_end, vf.seq_region_strand, vf.variation_id, vf.allele_string,
>>>>>> vf.variation_name, vf.map_weight, s.name, s.somatic,
>>>>>> vf.validation_status, vf.consequence_type, vf.class_so_id
>>>>>> FROM ( (variation_feature vf, source s)
>>>>>>   LEFT JOIN failed_variation fv ON fv.variation_id = vf.variation_id )
>>>>>>
>>>>>>  WHERE s.somatic = 0 AND
>>>>>>     (
>>>>>>         fv.variation_id IS NULL OR
>>>>>>         fv.subsnp_id IS NOT NULL
>>>>>>     )
>>>>>>      AND vf.seq_region_id = 142972 AND vf.seq_region_start <= 23402307
>>>>>> AND vf.seq_region_end >= 23402307 AND vf.seq_region_start >= 23401807  AND
>>>>>>        vf.source_id = s.source_id
>>>>>> ': DBD::mysql::st execute failed: Lost connection to MySQL server
>>>>>> during query at
>>>>>> /home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseAdaptor.pm line
>>>>>> 521, <GEN0> line 1006.
>>>>>>
>>>>>> STACK Bio::EnsEMBL::DBSQL::BaseAdaptor::generic_fetch
>>>>>> /home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseAdaptor.pm:522
>>>>>> STACK Bio::EnsEMBL::DBSQL::BaseFeatureAdaptor::_slice_fetch
>>>>>> /home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseFeatureAdaptor.pm:495
>>>>>> STACK
>>>>>> Bio::EnsEMBL::DBSQL::BaseFeatureAdaptor::fetch_all_by_Slice_constraint
>>>>>> /home/orf/EnsEMBL/src/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseFeatureAdaptor.pm:316
>>>>>> STACK
>>>>>> Bio::EnsEMBL::Variation::DBSQL::VariationFeatureAdaptor::fetch_all_by_Slice_constraint
>>>>>> /home/orf/EnsEMBL/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/VariationFeatureAdaptor.pm:121
>>>>>> STACK
>>>>>> Bio::EnsEMBL::Variation::DBSQL::VariationFeatureAdaptor::fetch_all_by_Slice
>>>>>> /home/orf/EnsEMBL/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/VariationFeatureAdaptor.pm:175
>>>>>> STACK main::print_consequences variant_effect_predictor.pl:318
>>>>>> STACK toplevel variant_effect_predictor.pl:289
>>>>>> Ensembl API version = 61
>>>>>> ---------------------------------------------------
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ketan Padiya
>>>>>> Research Fellow
>>>>>> Anand Veterinary College
>>>>>> Gujarat
>>>>>> India.
>>>>>> +91 9428969448
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Dev mailing list
>>>>>> Dev at ensembl.org
>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>


-- 
Ketan Padiya
Research Fellow
Anand Veterinary College
Gujarat
India.
+91 9428969448



