[ensembl-dev] Release 65 - 'not a valid species name' exception thrown by get_adaptor()

Andy Yates ayates at ebi.ac.uk
Wed Feb 1 09:58:12 GMT 2012


Hi Giuseppe,

On 31 Jan 2012, at 18:43, Giuseppe G wrote:

> 
> ----
> 
> I have one last question about this. In the case of Plasmodium falciparum, with the 'pan_homology' db I'm still able to get a $gene_adaptor in (1). The code will however fail at (2). Is this expected behaviour and related to your explanation above?
> 

This is expected. Plasmodium falciparum 3D7 is annotated at the strain level rather than the species level, so the correct taxonomic identifier for it is 36329 and not 5833. During the last release, Ensembl Genomes spent a long time ensuring that its taxonomic identifiers are in sync with UniProt.


> My understanding of it at the moment is that if the taxon name exists in the core database meta table, I will get a gene adaptor - even if the taxon is not amongst the 605 species contained in the pan_homology DB. Is this correct?

Your understanding is correct: an alias can be registered which matches the name of a node in the taxonomy even though that alias has been linked to a different taxonomic level. This could cause problems if the species an alias is assigned to changes. For example, suppose HB3 becomes the canonical strain and the alias "Plasmodium falciparum" moves from 3D7 to HB3. Your code would then use HB3 as the species of interest and not 3D7, which is what the external resource has used. I would avoid using the name at all costs and instead use just the taxon identifiers, e.g.

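# Index every GenomeDB by its NCBI taxon ID
my %gdb_hash = map { $_->taxon_id => $_ } @{$genomedb_adaptor->fetch_all()};
my $human = $gdb_hash{'9606'};   # Homo sapiens
my $plas = $gdb_hash{'36329'};   # Plasmodium falciparum 3D7 (strain level)
# Fetch the orthologue method link species set for this pair of genomes
my $orthologues_mlss = $mlssa->fetch_by_method_link_type_GenomeDBs('ENSEMBL_ORTHOLOGUES',[$plas,$human]);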

If for some reason you miss out on a taxon, it would be better to move up the NCBI taxonomy and re-attempt the fetch.
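
As a rough sketch of that fallback, assuming you have some way to find the parent of a taxon ID: the %parent_of table and the contents of %gdb_hash below are hand-written toy data for illustration only (they do not reflect what any real database holds), and fetch_with_fallback is a hypothetical helper, not part of the API.

```perl
use strict;
use warnings;

# Toy stand-in for the NCBI taxonomy: child taxon ID => parent taxon ID.
# Simplified and hand-written for illustration only.
my %parent_of = (
    36329 => 5833,    # strain 3D7 -> species Plasmodium falciparum
    5833  => 5820,    # species -> genus (intermediate ranks omitted)
);

# Toy stand-in for the %gdb_hash built from the GenomeDB adaptor.
# Strings are used here in place of real GenomeDB objects.
my %gdb_hash = (
    9606 => 'human genome_db',
    5833 => 'species-level genome_db',
);

# Hypothetical helper: try the given taxon ID, then each ancestor in turn.
# Returns (value, matched_taxon_id), or an empty list if nothing matches.
sub fetch_with_fallback {
    my ($taxon_id, $lookup, $parents) = @_;
    my %seen;    # guards against cycles in a malformed parent table
    while (defined $taxon_id && !$seen{$taxon_id}++) {
        return ($lookup->{$taxon_id}, $taxon_id) if exists $lookup->{$taxon_id};
        $taxon_id = $parents->{$taxon_id};    # move one level up
    }
    return;
}

my ($gdb, $matched) = fetch_with_fallback(36329, \%gdb_hash, \%parent_of);
print "matched taxon $matched: $gdb\n" if defined $gdb;
```

Returning the matched taxon ID alongside the result lets you tell whether you got the taxon you asked for or an ancestor, which matters if the external resource is strict about strain versus species.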

Best regards,

Andy


