[ensembl-dev] ensemblgenomes compara databases

Dan Staines dstaines at ebi.ac.uk
Wed Feb 23 12:54:56 GMT 2011

On 02/23/2011 12:42 PM, PATERSON Trevor wrote:
> 1) I take it species.division isn't listed for ensembl.org databases because they all belong to the 'multi' group.

Yes, this is a key used only within Ensembl Genomes.

> 2) Regards the second point: I don't think each compara database contains pairwise comparisons between all members
> For example in bacteria a cursory examination suggests that all members of the ecoli collection are compared with each other - but not with members of the bacillus collection

That's correct - bacteria is a slightly special case in that we compare 
members of each collection to each other, but not between the 
collections (it's actually 10 comparas merged into one). In the other 
divisions, all species are compared to each other for the peptide 
comparisons, but only closely related species for DNA comparison.

> So, in the case of no homology being found for a gene,  I think you have to pull out the species_sets to know whether a comparison was done....
> 3) In  the Pan compara database - I take it you just need to look and see which species are in it (in genome_db), to know whether you can search a species for homologies...

Yes, you'd need to check to see which species were compared in each 
compara by querying across genome_db, species_set, 
method_link_species_set and method_link. Pan compara is just another 
peptide compara database so the same should apply. Let us know if you 
need any more info on this.
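
As a rough illustration of that four-table query, here is a minimal Python sketch. It assumes the column names genome_db.name, species_set.genome_db_id, method_link.type and the *_id join keys as I understand the compara schema, so please check them against your release. The toy data (ecoli_a, ecoli_b, bacillus_a and all the ids) is hypothetical and only there so the example runs stand-alone with sqlite3; against a real MySQL compara database you would swap the connection and use %s placeholders instead of ?.

```python
import sqlite3

# Find every method_link_species_set whose species_set contains BOTH species.
# Table and column names are assumed from the compara schema; verify them
# against the release you are querying.
COMPARED_SQL = """
SELECT ml.type, mlss.method_link_species_set_id
FROM method_link_species_set mlss
JOIN method_link ml ON ml.method_link_id = mlss.method_link_id
JOIN species_set ss1 ON ss1.species_set_id = mlss.species_set_id
JOIN genome_db g1 ON g1.genome_db_id = ss1.genome_db_id
JOIN species_set ss2 ON ss2.species_set_id = mlss.species_set_id
JOIN genome_db g2 ON g2.genome_db_id = ss2.genome_db_id
WHERE g1.name = ? AND g2.name = ?
ORDER BY ml.type, mlss.method_link_species_set_id
"""


def comparisons_between(conn, species_a, species_b):
    """Return (method_link type, mlss id) rows covering both species."""
    cur = conn.cursor()
    cur.execute(COMPARED_SQL, (species_a, species_b))
    return cur.fetchall()


def build_toy_db():
    """Build a tiny in-memory stand-in for a compara database (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE genome_db (genome_db_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE species_set (species_set_id INTEGER, genome_db_id INTEGER);
        CREATE TABLE method_link (method_link_id INTEGER PRIMARY KEY, type TEXT);
        CREATE TABLE method_link_species_set (
            method_link_species_set_id INTEGER PRIMARY KEY,
            method_link_id INTEGER,
            species_set_id INTEGER
        );
        -- Two genomes from one collection, one genome from another.
        INSERT INTO genome_db VALUES (1, 'ecoli_a'), (2, 'ecoli_b'), (3, 'bacillus_a');
        -- Set 10 pairs the two ecoli genomes; set 11 holds bacillus_a alone.
        INSERT INTO species_set VALUES (10, 1), (10, 2), (11, 3);
        INSERT INTO method_link VALUES (201, 'ENSEMBL_ORTHOLOGUES');
        INSERT INTO method_link_species_set VALUES (1001, 201, 10), (1002, 201, 11);
    """)
    return conn


if __name__ == "__main__":
    conn = build_toy_db()
    # Same collection: a comparison exists.
    print(comparisons_between(conn, "ecoli_a", "ecoli_b"))
    # Different collections: empty result, so no comparison was done.
    print(comparisons_between(conn, "ecoli_a", "bacillus_a"))
```

An empty result for a pair of species means no comparison was run between them, which is different from a comparison that was run and found no homologues.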



Dan Staines, PhD               Ensembl Genomes Technical Coordinator
EMBL-EBI                       Tel: +44-(0)1223-492507
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/
