[ensembl-dev] variant effect predictor on bacteria genomes

Weihong Qi Weihong.Qi at fgcz.ethz.ch
Wed Sep 19 12:24:12 BST 2012


Dear Dan,

Thanks. I got it to work now. The problem was that I had used 
"escherichia_coli_dh10b". After following your hint and setting it to 
"e_coli_dh10b", it worked.
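
The fix amounts to changing only the --species value in the earlier 
command. A minimal sketch, assuming the same example.vcf input and 
test.vcf output names as in the quoted run below (the SPECIES and CMD 
variables are only for illustration):

```shell
# Strain-specific name that worked in this thread.
SPECIES="e_coli_dh10b"

# Same options as the failing run quoted below; only --species differs.
CMD="perl variant_effect_predictor.pl -i example.vcf -o test.vcf --genomes --species ${SPECIES}"

# Print the command for inspection before running it by hand.
echo "$CMD"
```

Running the printed command needs the script and the Ensembl API 
installed. For the K12 strain, Dan's suggested name e_coli_k12 would go 
in SPECIES instead.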

For such details, where is the best place to find help, such as a list 
of valid names to use with --species?

Thanks,

Weihong

On 9/19/2012 1:00 PM, Dan Staines wrote:
> This is because there is no single genome for Escherichia coli - we 
> have 37 different genomes for different strains of E. coli. If you 
> want the K12 genome, I suggest you specify e_coli_k12.
>
> Best,
>
> Dan.
>
> On 09/19/2012 11:56 AM, Weihong Qi wrote:
>> Dear Will,
>>
>> Thanks for the information. I tried this earlier and got an error
>> stating that escherichia coli is not a valid species name. I tried the
>> same with arabidopsis thaliana and it worked, so I thought I needed
>> special settings for connecting to Ensembl Bacteria. Screen dumps are
>> appended for reference.
>>
>> perl variant_effect_predictor.pl -i example.vcf -o test.vcf --genomes --species escherichia_coli
>>
>> -------------------- WARNING ----------------------
>> MSG: escherichia_coli is not a valid species name (check DB and API version)
>> FILE: Bio/EnsEMBL/Registry.pm LINE: 1172
>> CALLED BY: variant_effect_predictor.pl  LINE: 653
>> Ensembl API version = 67
>> ---------------------------------------------------
>>
>> -------------------- WARNING ----------------------
>> MSG: escherichia_coli is not a valid species name (check DB and API version)
>> FILE: Bio/EnsEMBL/Registry.pm LINE: 1172
>> CALLED BY: Bio/EnsEMBL/Registry.pm  LINE: 957
>> Ensembl API version = 67
>> ---------------------------------------------------
>>
>> -------------------- EXCEPTION --------------------
>> MSG: Can not find internal name for species 'escherichia_coli'
>> STACK Bio::EnsEMBL::Registry::get_adaptor
>> /misc/ngseq/src/variant_effect_predictor/Bio/EnsEMBL/Registry.pm:959
>> STACK main::get_adaptors variant_effect_predictor.pl:1054
>> STACK main::configure variant_effect_predictor.pl:766
>> STACK toplevel variant_effect_predictor.pl:66
>> Ensembl API version = 67
>> ---------------------------------------------------
>>
>>
>> perl variant_effect_predictor.pl -i example.vcf -o test.vcf --genomes --species arabidopsis_thaliana
>>
>> 2012-09-19 12:50:31 - Starting...
>> 2012-09-19 12:50:31 - Detected format of input file as vcf
>> 2012-09-19 12:50:31 - Read 173 variants into buffer
>> 2012-09-19 12:50:31 - Analyzing chromosome 21
>> 2012-09-19 12:50:31 - Reading transcript data from cache and/or database
>> [=======================================================================================================================================================================================================] 
>>
>> [ 100% ]
>> 2012-09-19 12:50:37 - Retrieved 0 transcripts (0 mem, 0 cached, 0 DB, 0 duplicates)
>> 2012-09-19 12:50:38 - Analyzing variants
>> [=======================================================================================================================================================================================================] 
>>
>> [ 100% ]
>> 2012-09-19 12:50:38 - Calculating and writing output
>> [=======================================================================================================================================================================================================] 
>>
>> [ 100% ]
>> 2012-09-19 12:50:38 - Analyzing chromosome 22
>> 2012-09-19 12:50:38 - Reading transcript data from cache and/or database
>> [=======================================================================================================================================================================================================] 
>>
>> [ 100% ]
>> 2012-09-19 12:51:10 - Retrieved 0 transcripts (0 mem, 0 cached, 0 DB, 0 duplicates)
>> 2012-09-19 12:51:11 - Analyzing variants
>> [=======================================================================================================================================================================================================] 
>>
>> [ 100% ]
>> 2012-09-19 12:51:11 - Calculating and writing output
>> [=======================================================================================================================================================================================================] 
>>
>> [ 100% ]
>> 2012-09-19 12:51:11 - Processed 173 total variants
>> 2012-09-19 12:51:11 - Finished!
>>
>>
>> On 9/19/2012 12:45 PM, Will McLaren wrote:
>>> Hello,
>>>
>>> Yes this is possible assuming you know the species name for the
>>> bacteria you wish to use.
>>>
>>> You can use the --genomes flag which is a shortcut that makes the
>>> script connect to the Ensembl Genomes public server:
>>>
>>> perl variant_effect_predictor.pl -i variants.vcf --genomes --species [species_name]
>>>
>>> Thanks
>>>
>>> Will McLaren
>>> Ensembl Variation
>>>
>>> On 19 September 2012 10:55, Weihong Qi <Weihong.Qi at fgcz.ethz.ch> wrote:
>>>> Dear Ensembl developers,
>>>>
>>>> Can the variant effect predictor (perl script) connect to Ensembl
>>>> Bacteria genomes? If it can, what would the command-line switches be?
>>>>
>>>> Thanks,
>>>>
>>>> Weihong
>>>>
>>>> -- 
>>>> Weihong Qi, PhD
>>>> Functional Genomics Center Zurich
>>>> Uni/ETH Zurich
>>>> Winterthurerstrasse 190 / Y32 H66
>>>> CH-8057 Zurich
>>>>
>>>> Phone (Fixed line office):  +41 44 635 3964
>>>> Phone (Mobile office): +41 44 635 3997
>>>> Fax:  +41 44 635 3922
>>>> E-mail: weihong.qi at fgcz.ETHZ.ch
>>>> Web: http://www.fgcz.ch
>>>>
>>>>
>>>> _______________________________________________
>>>> Dev mailing list    Dev at ensembl.org
>>>> List admin (including subscribe/unsubscribe):
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
>>
>>
>>
>


-- 
Weihong Qi, PhD
Functional Genomics Center Zurich
Uni/ETH Zurich
Winterthurerstrasse 190 / Y32 H66
CH-8057 Zurich

Phone (Fixed line office):  +41 44 635 3964
Phone (Mobile office): +41 44 635 3997
Fax:  +41 44 635 3922
E-mail: weihong.qi at fgcz.ETHZ.ch
Web:  http://www.fgcz.ch




