[ensembl-dev] Orthologs in ensembl-genomes...MethodLinkSpeciesSets not found for two species: Bos taurus(Btau_4.0) and Rattus norvegicus(RGSC3.4)

Andy Yates ayates at ebi.ac.uk
Tue Aug 24 09:36:09 BST 2010


Hi Niran,

If you look here (linked from the EnsemblBacteria site, but all the sites give the same information) you will see that neither Rat nor Cow is present in the pan compara analysis.

http://bacteria.ensembl.org/info/docs/compara/homology_method.html#species

The aim of the compara analysis provided in the pan_homology database is to run Compara's GeneTree pipeline over as wide a spread of the taxonomy as possible, rather than to include every species. If you want to know programmatically what is available, you can search for all method link species sets with the method link type PROTEIN_TREES. There will only be one in the pan_homology database, and it contains all the species involved in the pan-taxonomic compara. If you just want to iterate through the arrays of species you already have, you could change your API call to:

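# The third argument (1) stops the warning being printed when no MLSS is found
my $method_link_species_set = $mlss_adaptor->fetch_by_method_link_type_GenomeDBs("ENSEMBL_ORTHOLOGUES", $genome_dbs, 1);

if (! defined $method_link_species_set) {
  # No MLSS for this pair here: skip it & try Ensembl's MULTI database
}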

That would prevent any odd-looking warnings from appearing on screen, and as long as you handle the undefined MLSS correctly it shouldn't be a problem.
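
As a rough sketch of that loop (not tested against a live database), something like the following would collect the pairs that have an MLSS and remember the ones that don't. The sub name orthologue_mlss_for_pairs is made up for illustration, and it assumes you already hold a MethodLinkSpeciesSetAdaptor and an array of GenomeDB pairs:

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
# $mlss_adaptor    : a MethodLinkSpeciesSetAdaptor (assumed already fetched)
# $genome_db_pairs : array ref of [ $genome_db_a, $genome_db_b ] pairs
# Returns two array refs: pairs with an MLSS, and pairs without one.
sub orthologue_mlss_for_pairs {
    my ($mlss_adaptor, $genome_db_pairs) = @_;
    my (@found, @missing);

    foreach my $genome_dbs (@{$genome_db_pairs}) {
        # Third argument (1) suppresses the warning when nothing is found
        my $mlss = $mlss_adaptor->fetch_by_method_link_type_GenomeDBs(
            'ENSEMBL_ORTHOLOGUES', $genome_dbs, 1);

        if (defined $mlss) {
            # Keep the pair together with its MLSS for the homology lookup
            push @found, [ $genome_dbs, $mlss ];
        }
        else {
            # Not in this database: remember the pair so it can be retried
            push @missing, $genome_dbs;
        }
    }

    return (\@found, \@missing);
}
```

Each MLSS in the first list can then go to fetch_all_by_MethodLinkSpeciesSet as in your script, and the second list is what you would retry against Ensembl's own compara database (the GenomeDB objects would most likely need to be fetched again from that database first).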

On another note, those species should be available in Ensembl's release 58 database, which you can connect to using the EnsemblGenomes 58 branch API.
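
If you would rather check up front which species are in the pan-taxonomic analysis, the PROTEIN_TREES lookup mentioned earlier could be sketched roughly as below. The sub name protein_tree_species_names is made up, and the code assumes that species_set on the MLSS returns an array ref of GenomeDB objects, which I believe is the case on this branch; later releases changed that accessor, so check the documentation for the branch you are using:

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
# $mlss_adaptor : a MethodLinkSpeciesSetAdaptor for the pan_homology database
# Returns a sorted list of the species names found in PROTEIN_TREES MLSSs.
sub protein_tree_species_names {
    my ($mlss_adaptor) = @_;
    my %names;

    my $mlss_list = $mlss_adaptor->fetch_all_by_method_link_type('PROTEIN_TREES');
    foreach my $mlss (@{$mlss_list}) {
        # ASSUMPTION: species_set gives an array ref of GenomeDB objects here
        foreach my $genome_db (@{ $mlss->species_set }) {
            $names{ $genome_db->name } = 1;
        }
    }

    my @sorted = sort keys %names;
    return @sorted;
}
```

Any species from your list that is not in the returned names can be sent straight to the Ensembl database without trying pan_homology first.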

Hope this helps,

Andy

On 23 Aug 2010, at 18:02, Niran Abeygunawardena wrote:

> Hi,
> 
> I'm using branch-ensemblgenomes-5-58 api to access orthologs. For most species pairs, it works except for combinations involving these two species:
> Bos taurus(Btau_4.0) and Rattus norvegicus(RGSC3.4)
> 
> 
> For example, I get the following error from my attached scripts:
> -------------------- WARNING ----------------------
> MSG: No Bio::EnsEMBL::Compara::MethodLinkSpeciesSet found for
>  <ENSEMBL_ORTHOLOGUES> and Arabidopsis thaliana(TAIR9), Rattus norvegicus(RGSC3.4)
> FILE: Compara/DBSQL/MethodLinkSpeciesSetAdaptor.pm LINE: 669
> CALLED BY: orthologs.pl  LINE: 96
> ---------------------------------------------------
> 
> -------------------- EXCEPTION --------------------
> MSG: method_link_species_set arg is required
> 
> STACK Bio::EnsEMBL::Compara::DBSQL::HomologyAdaptor::fetch_all_by_MethodLinkSpeciesSet /ebi/microarray/sw/dw/src/genomes-5/ensembl-compara/modules/Bio/EnsEMBL/Compara/DBSQL/HomologyAdaptor.pm:325
> STACK toplevel orthologs.pl:97
> ---------------------------------------------------
> 
> 
> Is there a reason why these two species are not used in the compara pipeline to find orthologs?
> 
> I found that orthologs for these two species are present in the branch-ensembl-59 api, but I can't use it since I require orthologs from other species such as plants, fungi and bacteria, unless I call this api separately, which I would prefer not to do unless there is a good reason. Anyway, I only require ortholog pairs from these species combinations:
> anopheles gambiae
> arabidopsis thaliana
> bacillus subtilis
> bos taurus
> caenorhabditis elegans
> danio rerio
> drosophila melanogaster
> gallus gallus
> homo sapiens
> mus musculus
> rattus norvegicus
> saccharomyces cerevisiae
> schizosaccharomyces pombe
> 
> 
> I'm happy to listen to any help to improve performance or do this an alternative way..thanks :)
> 
> Best,
> Niran
> <orthologs.pl><orthologs.sh>
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev

-- 
Andrew Yates                   Ensembl Genomes Engineer
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/







