[ensembl-dev] Gorilla assembly coverage depth
Matthieu Muffato
muffato at ebi.ac.uk
Fri Sep 16 13:04:08 BST 2011
Hi Will and Greg,
For the protein tree pipeline, this tag is used to select the genomes
for the dn/ds calculation. Because gorilla was flagged as a low-coverage
genome, we don't have any dn/ds values for gorilla vs * homologues.
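To make the selection concrete, here is a minimal shell sketch of the rule as Will describes it further down the thread: any meta_value other than 'high' or '6X' counts as low coverage. This is an illustration only, not the pipeline's actual code. The function names is_high_coverage, coverage_depth and dnds_status are hypothetical; the host and user are the ones from Greg's query.

```shell
#!/bin/sh
# Sketch only: the 'high'/'6X' rule comes from Will's description in this
# thread, not from the Compara source code.

# Return success (0) if a meta_value counts as high coverage.
is_high_coverage() {
    case "$1" in
        high|6X) return 0 ;;
        *)       return 1 ;;
    esac
}

# Print the assembly.coverage_depth meta_value of one core database.
# Host and user are taken from Greg's query; adjust them for your own mirror.
coverage_depth() {
    mysql -uensro -hens-livemirror -N -B -e \
        "SELECT meta_value FROM $1.meta WHERE meta_key='assembly.coverage_depth'"
}

# Hypothetical helper: report whether a database would be kept for dn/ds.
dnds_status() {
    if is_high_coverage "$(coverage_depth "$1")"; then
        echo "$1: kept for dn/ds"
    else
        echo "$1: skipped (low coverage)"
    fi
}
```

Calling dnds_status gorilla_gorilla_core_64_31 against a mirror that still holds the 'low' value would report the database as skipped, which matches the missing dn/ds values.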
By the way, some of the gorilla CDS sequences stored in the Compara
database are erroneous (http://www.ensembl.info/contact-us/known-bugs/),
so any comparative analysis using gorilla should fetch the CDS sequences
from the core database instead (the protein sequences are unaffected).
Hope this helps,
Matthieu
On 15/09/11 22:56, William Spooner wrote:
> Thanks for the heads-up Greg,
>
> This meta_key is certainly used by the Compara ProteinTrees pipeline (Bio::EnsEMBL::Compara::RunnableDB::ProteinTrees::GroupGenomesUnderTaxa), although I don't know what the downstream ramifications of the 'low' (basically not 'high' or '6X') setting are. I tend to set everything to 'high' to be on the safe side.
>
> Will
>
> On 15 Sep 2011, at 18:36, Gregory Jordan wrote:
>
>> I understand that things in the 'meta' table tend to be for internal use only. But the assembly coverage depth information is only accessible from there, and surely this can't be accurate anymore:
>>
>>> mysql -uensro -hens-livemirror -e "select * from gorilla_gorilla_core_64_31.meta where meta_key='assembly.coverage_depth'\G"
>> *************************** 1. row ***************************
>> meta_id: 81
>> species_id: 1
>> meta_key: assembly.coverage_depth
>> meta_value: low
>>
>> I doubt many people are actually using this undocumented information... but it caught me off guard, and it would be a shame for someone attempting to filter out low-coverage genomes to end up throwing the baby out with the bathwater, so to speak!
>>
>> Cheers,
>> greg
>
> --
> William Spooner
> whs at eaglegenomics.com
> http://www.eaglegenomics.com
>
--
Matthieu Muffato, Ph.D.
Ensembl Developer - Comparative Genomics
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom