[ensembl-dev] [EnsemblGenomes] getting genus name -> for all species in EnsemblGenomes sites
Giuseppe Gallone
G.Gallone at sms.ed.ac.uk
Sun Aug 7 14:22:55 BST 2011
Hi,
I have some scripts that use the API to query EnsemblGenomes data. I was
wondering what the best way is to obtain an up-to-date list of
EnsemblGenomes species through the API.
In other words, every time I run the scripts I need to know that, e.g.,
the "Metazoa" database includes the genera
'Acyrthosiphon'
'Aedes'
'Anopheles'
'Apis'
'Caenorhabditis'
'Culex'
'Daphnia'
'Drosophila'
'Ixodes'
'Nematostella'
'Pediculus'
'Pristionchus'
'Schistosoma'
'Strongylocentrotus'
'Trichoplax'
while, e.g., 'fungi' includes
'Aspergillus'
'Fusarium'
'Gibberella'
'Nectria'
'Neosartorya'
'Neurospora'
'Puccinia'
'Saccharomyces'
'Schizosaccharomyces'
'Ustilago'
and so on for all five sites. At the moment, I have these genera
hard-coded using hashes linking them to their site name (Fusarium =>
fungi), but this is of course a sub-par solution: new genera may be
added with every release, and I would need to keep the hash current by hand.
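For reference, the hard-coded approach described above might look roughly like the sketch below. The hash name %site_for_genus and the helper site_for() are hypothetical, and only a handful of the genera listed earlier are included.

```perl
use strict;
use warnings;

# Hypothetical hard-coded lookup table: genus => EnsemblGenomes site.
# Only a few of the genera listed above are shown.
my %site_for_genus = (
    Aedes         => 'metazoa',
    Drosophila    => 'metazoa',
    Fusarium      => 'fungi',
    Saccharomyces => 'fungi',
);

# Return the site for a genus (case-insensitive), or undef if the
# genus is missing from the table, e.g. because it was added in a
# later release.
sub site_for {
    my ($genus) = @_;
    return $site_for_genus{ ucfirst( lc($genus) ) };
}

my $site = site_for('fusarium');
print defined $site ? "$site\n" : "unknown genus\n";
```

Any genus added in a new release silently returns undef here, which is the maintenance problem with this approach.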
I did try using the GenomeDB adaptor. I call it once for each site, then
get the genome names and trim everything after the underscore to get the
genus list. Example:
# Fetch every GenomeDB registered in the 'plants' Compara database
my $genome_db_adaptor = Bio::EnsEMBL::Registry->get_adaptor('plants',
    'compara', 'GenomeDB');
my $all_genome_dbs = $genome_db_adaptor->fetch_all();
my %species_names;    # genome name => 1
foreach my $genome (@{$all_genome_dbs}) {
    $species_names{ $genome->name } = 1;
}
...etc
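The trimming step mentioned above (dropping everything after the underscore) could be sketched as follows. This is plain Perl with no Ensembl dependency; the function name genera_from_names() is hypothetical, and the input names are illustrative values in the "genus_species" form rather than actual API output.

```perl
use strict;
use warnings;

# Reduce "genus_species" style names to a sorted list of unique genera.
sub genera_from_names {
    my @names = @_;
    my %seen;
    for my $name (@names) {
        # Copy the name, then drop the first underscore and everything after it.
        ( my $genus = $name ) =~ s/_.*//s;
        $seen{$genus} = 1;
    }
    my @genera = sort keys %seen;
    return @genera;
}

# Illustrative input only.
my @genera = genera_from_names(
    qw(arabidopsis_thaliana oryza_sativa oryza_indica ancestral_sequences)
);
print "$_\n" for @genera;
```

Using a hash for de-duplication means two species from the same genus collapse into a single entry.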
The problem with using the GenomeDB adaptors is that there is a mismatch
between the species listed on the website and what I retrieve from
the API. For example, for plants, the website
http://plants.ensembl.org/info/about/species.html
reports for V.63:
'Arabidopsis'
'Brachypodium'
'Oryza'
'Physcomitrella'
'Populus'
'Sorghum'
'Vitis'
'Zea'
but what I get from the API is the following:
ancestral
arabidopsis
brachypodium
caenorhabditis
ciona
drosophila
homo
oryza
physcomitrella
populus
saccharomyces
sorghum
vitis
zea
and similarly for the other sites.
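To make the mismatch explicit, the two plant lists above can be compared with a simple set difference. This is only a sketch: the function name extra_genera() is hypothetical, and the two arrays are copied from the lists in this message.

```perl
use strict;
use warnings;

# Return the entries of @$api that do not appear (case-insensitively)
# in @$site, preserving the order of @$api.
sub extra_genera {
    my ( $api, $site ) = @_;
    my %on_site = map { ( lc($_) => 1 ) } @$site;
    my @extra = grep { !$on_site{ lc($_) } } @$api;
    return @extra;
}

# Genera listed on the plants website.
my @website = qw(Arabidopsis Brachypodium Oryza Physcomitrella
                 Populus Sorghum Vitis Zea);

# Genera obtained through the GenomeDB adaptor.
my @from_api = qw(ancestral arabidopsis brachypodium caenorhabditis
                  ciona drosophila homo oryza physcomitrella populus
                  saccharomyces sorghum vitis zea);

my @extra = extra_genera( \@from_api, \@website );
print "@extra\n";    # ancestral caenorhabditis ciona drosophila homo saccharomyces
```

The six extra entries are "ancestral" plus five non-plant genera, so the API result is a superset of the website list rather than a different set.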
Thanks a lot for your work and for your suggestions about this.
Best,
Giuseppe