[ensembl-dev] variant effect predictor on bacteria genomes

Dan Staines dstaines at ebi.ac.uk
Wed Sep 19 12:00:23 BST 2012


This is because there is no single genome for Escherichia coli - we have 
37 different genomes for different strains of E. coli. If you want the 
K12 genome, I suggest you specify e_coli_k12.
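
As a minimal sketch, the failing command from the quoted message below would change only in its --species value. The file names are the ones used in that message. Whether e_coli_k12 resolves on your installation is an assumption here, since it depends on the API and database release in use.

```shell
# Strain-specific species name suggested above for the K12 genome.
species="e_coli_k12"

# Same invocation as in the quoted message, with only --species changed.
cmd="perl variant_effect_predictor.pl -i example.vcf -o test.vcf --genomes --species $species"

# Print the command for inspection before running it from the script directory.
echo "$cmd"
```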

Best,

Dan.

On 09/19/2012 11:56 AM, Weihong Qi wrote:
> Dear Will,
>
> Thanks for the information. I tried this earlier and got an error stating
> that escherichia coli is not a valid species name. I tried the same with
> arabidopsis thaliana and it worked, so I thought I needed special settings
> for connecting to Ensembl Bacteria. Screen dumps are appended for
> reference.
>
> perl variant_effect_predictor.pl -i example.vcf -o test.vcf --genomes
> --species escherichia_coli
>
> -------------------- WARNING ----------------------
> MSG: escherichia_coli is not a valid species name (check DB and API version)
> FILE: Bio/EnsEMBL/Registry.pm LINE: 1172
> CALLED BY: variant_effect_predictor.pl  LINE: 653
> Ensembl API version = 67
> ---------------------------------------------------
>
> -------------------- WARNING ----------------------
> MSG: escherichia_coli is not a valid species name (check DB and API version)
> FILE: Bio/EnsEMBL/Registry.pm LINE: 1172
> CALLED BY: Bio/EnsEMBL/Registry.pm  LINE: 957
> Ensembl API version = 67
> ---------------------------------------------------
>
> -------------------- EXCEPTION --------------------
> MSG: Can not find internal name for species 'escherichia_coli'
> STACK Bio::EnsEMBL::Registry::get_adaptor
> /misc/ngseq/src/variant_effect_predictor/Bio/EnsEMBL/Registry.pm:959
> STACK main::get_adaptors variant_effect_predictor.pl:1054
> STACK main::configure variant_effect_predictor.pl:766
> STACK toplevel variant_effect_predictor.pl:66
> Ensembl API version = 67
> ---------------------------------------------------
>
>
> perl variant_effect_predictor.pl -i example.vcf -o test.vcf --genomes
> --species arabidopsis_thaliana
>
> 2012-09-19 12:50:31 - Starting...
> 2012-09-19 12:50:31 - Detected format of input file as vcf
> 2012-09-19 12:50:31 - Read 173 variants into buffer
> 2012-09-19 12:50:31 - Analyzing chromosome 21
> 2012-09-19 12:50:31 - Reading transcript data from cache and/or database
> [=======================================================================================================================================================================================================]
> [ 100% ]
> 2012-09-19 12:50:37 - Retrieved 0 transcripts (0 mem, 0 cached, 0 DB, 0
> duplicates)
> 2012-09-19 12:50:38 - Analyzing variants
> [=======================================================================================================================================================================================================]
> [ 100% ]
> 2012-09-19 12:50:38 - Calculating and writing output
> [=======================================================================================================================================================================================================]
> [ 100% ]
> 2012-09-19 12:50:38 - Analyzing chromosome 22
> 2012-09-19 12:50:38 - Reading transcript data from cache and/or database
> [=======================================================================================================================================================================================================]
> [ 100% ]
> 2012-09-19 12:51:10 - Retrieved 0 transcripts (0 mem, 0 cached, 0 DB, 0
> duplicates)
> 2012-09-19 12:51:11 - Analyzing variants
> [=======================================================================================================================================================================================================]
> [ 100% ]
> 2012-09-19 12:51:11 - Calculating and writing output
> [=======================================================================================================================================================================================================]
> [ 100% ]
> 2012-09-19 12:51:11 - Processed 173 total variants
> 2012-09-19 12:51:11 - Finished!
>
>
> On 9/19/2012 12:45 PM, Will McLaren wrote:
>> Hello,
>>
>> Yes this is possible assuming you know the species name for the
>> bacteria you wish to use.
>>
>> You can use the --genomes flag which is a shortcut that makes the
>> script connect to the Ensembl Genomes public server:
>>
>> perl variant_effect_predictor.pl -i variants.vcf --genomes --species
>> [species_name]
>>
>> Thanks
>>
>> Will McLaren
>> Ensembl Variation
>>
>> On 19 September 2012 10:55, Weihong Qi <Weihong.Qi at fgcz.ethz.ch> wrote:
>>> Dear Ensembl developers,
>>>
>>> Can the variant effect predictor (perl script) connect to Ensembl Bacteria
>>> genomes? If it does, what will the command line switches be?
>>>
>>> Thanks,
>>>
>>> Weihong
>>>
>>> --
>>> Weihong Qi, PhD
>>> Functional Genomics Center Zurich
>>> Uni/ETH Zurich
>>> Winterthurerstrasse 190 / Y32 H66
>>> CH-8057 Zurich
>>>
>>> Phone (Fixed line office):  +41 44 635 3964
>>> Phone (Mobile office): +41 44 635 3997
>>> Fax:  +41 44 635 3922
>>> E-mail: weihong.qi at fgcz.ETHZ.ch
>>> Web: http://www.fgcz.ch
>>>
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> List admin (including subscribe/unsubscribe):
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/

-- 
Dan Staines, PhD               Ensembl Genomes Technical Coordinator
EMBL-EBI                       Tel: +44-(0)1223-492507
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/



