[ensembl-dev] ensemblgenomes compara databases

PATERSON Trevor trevor.paterson at roslin.ed.ac.uk
Wed Feb 23 12:42:56 GMT 2011


 
Ta, that's very helpful.

1) I take it species.division isn't listed for ensembl.org databases because they all belong to the 'multi' group.

2) Regarding the second point: I don't think each compara database contains pairwise comparisons between all of its members.

For example, in bacteria a cursory examination suggests that all members of the ecoli collection are compared with each other, but not with members of the bacillus collection.

So, when no homology is found for a gene, I think you have to pull out the species_sets to know whether a comparison was actually done.

3) In the Pan compara database, I take it you just need to look and see which species are in it (in genome_db) to know whether you can search a species for homologies.
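
To make points 2 and 3 concrete, here is a minimal sketch of the two checks. It assumes the compara tables genome_db, species_set and method_link_species_set are linked by genome_db_id and species_set_id. The helper names has_genome and was_compared, and the genome names in the toy data, are made up for illustration. The sketch runs against an in-memory SQLite stand-in so that it is self-contained; against the real MySQL servers the placeholder style would be %s rather than ?, and you would probably also want to restrict the check to the homology method_link types.

```python
import sqlite3


def has_genome(conn, name):
    """True if a genome with this name is listed in genome_db."""
    row = conn.execute(
        "SELECT 1 FROM genome_db WHERE name = ?", (name,)
    ).fetchone()
    return row is not None


def was_compared(conn, name_a, name_b):
    """True if some method_link_species_set covers both genomes.

    This does not filter on method_link type, so it only says that *an*
    analysis covered both genomes, not which kind of analysis it was.
    """
    sql = """
        SELECT 1
        FROM method_link_species_set mlss
        JOIN species_set ss_a ON ss_a.species_set_id = mlss.species_set_id
        JOIN genome_db g_a ON g_a.genome_db_id = ss_a.genome_db_id
        JOIN species_set ss_b ON ss_b.species_set_id = mlss.species_set_id
        JOIN genome_db g_b ON g_b.genome_db_id = ss_b.genome_db_id
        WHERE g_a.name = ? AND g_b.name = ?
        LIMIT 1
    """
    return conn.execute(sql, (name_a, name_b)).fetchone() is not None


# Toy stand-in for a compara database (hypothetical genome names).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genome_db (genome_db_id INTEGER, name TEXT);
    CREATE TABLE species_set (species_set_id INTEGER, genome_db_id INTEGER);
    CREATE TABLE method_link_species_set (
        method_link_species_set_id INTEGER,
        method_link_id INTEGER,
        species_set_id INTEGER
    );
    INSERT INTO genome_db VALUES (1, 'ecoli_a'), (2, 'ecoli_b'), (3, 'bacillus_a');
    INSERT INTO species_set VALUES (10, 1), (10, 2), (11, 3);
    INSERT INTO method_link_species_set VALUES (100, 201, 10), (101, 201, 11);
""")

print(has_genome(conn, "bacillus_a"))
print(was_compared(conn, "ecoli_a", "ecoli_b"))
print(was_compared(conn, "ecoli_a", "bacillus_a"))
```

With this toy data both genomes are present in genome_db, but only the two ecoli genomes share a species set, so a missing homology between ecoli_a and bacillus_a would mean "never compared" rather than "compared and nothing found".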

Thanks again

Trevor Paterson PhD
email trevor.paterson at roslin.ed.ac.uk

Bioinformatics 
The Roslin Institute
The Royal (Dick) School of Veterinary Studies
University of Edinburgh
Scotland EH25 9PS
phone +44 (0)131 5274197
http://bioinformatics.roslin.ed.ac.uk/


The University of Edinburgh is a charitable body, registered in Scotland with registration number SC005336




-----Original Message-----
From: Dan Staines [mailto:dstaines at ebi.ac.uk] 
Sent: 23 February 2011 12:30
To: PATERSON Trevor
Cc: 'dev at ensembl.org'
Subject: Re: [ensembl-dev] ensemblgenomes compara databases



On 02/23/2011 12:22 PM, PATERSON Trevor wrote:
> Do the individual core databases contain information about which group the species belongs to? [it would be nice if this was listed in 'meta' table].

Yes, this is in the meta table as "species.division", e.g. "EnsemblMetazoa". You could use this to work out which division-specific compara db to use.
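
As a rough illustration of that lookup, the sketch below reads the "species.division" key from a core database's meta table. It assumes the usual meta columns (species_id, meta_key, meta_value). The function name species_division is made up for this example, and an in-memory SQLite table stands in for a real core database so the snippet can be run as is (a MySQL driver would use %s placeholders instead of ?).

```python
import sqlite3


def species_division(conn, species_id=1):
    """Return the value stored under 'species.division', or None if absent.

    species_id matters for collection databases, which hold several
    species in one core database.
    """
    row = conn.execute(
        "SELECT meta_value FROM meta WHERE meta_key = ? AND species_id = ?",
        ("species.division", species_id),
    ).fetchone()
    return row[0] if row else None


# Toy stand-in for a core database's meta table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE meta ("
    "meta_id INTEGER PRIMARY KEY, species_id INTEGER, "
    "meta_key TEXT, meta_value TEXT)"
)
conn.execute(
    "INSERT INTO meta (species_id, meta_key, meta_value) "
    "VALUES (1, 'species.division', 'EnsemblMetazoa')"
)

print(species_division(conn))
```

How the returned division name maps onto a particular compara database name is not covered here; that would depend on the naming used for the release you are querying.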

 > And is metadata stored about which species are used in pairwise comparisons in each compara database (without having to query the compara database for which genome databases it references)?

Not sure what you mean here - can you give me an example of what you need to do?

 > Also what is the compara_pan database?? - the contents of all the other databases??

Pan compara is a peptide compara database produced from a set of selected species that are taken from all EnsemblGenomes divisions and from Ensembl (but doesn't include all species from all divisions).

Dan.

-- 
Dan Staines, PhD               Ensembl Genomes Technical Coordinator
EMBL-EBI                       Tel: +44-(0)1223-492507
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/



