[ensembl-dev] Bio::Species doesn't like scientific names containing brackets

Nick Langridge nickl at ebi.ac.uk
Wed Jun 29 15:06:03 BST 2011


Thanks Chris.

I can't avoid using the module, as the Ensembl API uses it, but I've 
ended up with a hacked version that works for our purposes.

Cheers,
Nick

On 29/06/2011 14:54, Chris Fields wrote:
> Just want to point out this problem has been generally solved in the latest BioPerl, in that we deprecated Bio::Species due to fussiness with parsing this data (it's just not possible to cover every edge case).  It's possible this has been fixed though.
>
> chris
>
> On Jun 29, 2011, at 6:02 AM, Nick Langridge wrote:
>
>    
>> Hi,
>>
>> I'm having problems with Bio::Species and species that have brackets in their scientific names, e.g. "Buchnera aphidicola (subsp. Acyrthosiphon pisum, strain 5A)".
>>
>> Bio::Species tries to extract the genus, species, and subspecies from the scientific name, but it ends up with mismatched brackets, e.g.
>> genus:  "Buchnera"
>> species: "aphidicola (subsp."
>> subspecies: "Acyrthosiphon pisum, strain 5A)"
>>
>> This causes an 'Unmatched ( in regex' runtime error when the module later tries to use the species value directly in a regex.
>>
>> Does anyone know what should be happening here? Are brackets allowed, and if so, how should Bio::Species be dealing with them?
>>
>> The runtime error would be easy to trap by escaping the text in the regex, but I suspect that really the problem is that species/subspecies shouldn't contain brackets in the first place (?)
>>
>> Cheers,
>> Nick
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>      
>    
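
For reference, the failure described in the thread and the escaping workaround Nick mentions can be reproduced in a few lines of Perl. This is only a minimal sketch, not the Bio::Species code: the helper names matches_unescaped and matches_literal are hypothetical, and the strings are the ones quoted in the original message.

```perl
use strict;
use warnings;

# Values taken from the example in the thread.
my $name    = 'Buchnera aphidicola (subsp. Acyrthosiphon pisum, strain 5A)';
my $species = 'aphidicola (subsp.';

# Hypothetical helper: interpolates $fragment as a pattern, so regex
# metacharacters in it are interpreted. Returns undef if the pattern
# fails to compile (the error message is left in $@).
sub matches_unescaped {
    my ($text, $fragment) = @_;
    my $result = eval { $text =~ /$fragment/ ? 1 : 0 };
    return $result;
}

# Hypothetical helper: \Q...\E quotes metacharacters, so $fragment is
# matched as a literal string.
sub matches_literal {
    my ($text, $fragment) = @_;
    return $text =~ /\Q$fragment\E/ ? 1 : 0;
}

my $unescaped = matches_unescaped($name, $species);
print defined $unescaped ? "unescaped: $unescaped\n" : "unescaped: error: $@";
print "literal: ", matches_literal($name, $species), "\n";
```

Quoting with \Q...\E (or quotemeta) only traps the symptom. It does not settle the question raised in the thread of whether species and subspecies values should contain brackets in the first place.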