[ensembl-dev] get conservation of a region based on multi-species genomic alignment

Matthieu Muffato muffato at ebi.ac.uk
Mon Jun 8 12:41:45 BST 2015


Dear Julien,

We have experience with GERP, and we routinely run it on the 39-way and 
other sets (this is the ConservationScore object you've found in our 
API). We don't run it on the primates alignment though, so you could use 
our API to look at conservation between all *mammals*, not only primates.
The score() method of GenomicAlignBlock returns the LASTZ score of the 
aligned regions. It's not a measure of conservation.

I think the main question is: do you want a score that reflects the 
conservation of the region amongst the 8 species? Or something that 
tells how much of the human sequence is shared by the other primates? 
(GERP is of the first category, for instance.)
Then how do you want to interpret the conservation score? As similarity 
between the primates, or as the presence of an ancestral state? We could 
imagine that a human region is not conserved amongst all primates, but 
is still the same as in the last common ancestor of primates / mammals.
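
To make the distinction concrete, here is a minimal Python sketch on a toy 
alignment. It does not use the Ensembl API: the sequences, the species 
names and the three function names are made up for illustration, and 
counting a gap in another species as a mismatch is an assumption you may 
want to change. It computes the two human-centred measures from the 
original question, plus the identity to a (hypothetical) ancestral 
sequence.

```python
# Toy aligned sequences (all the same length); '-' is a gap.
# These are invented for illustration, not real EPO alignment data.
aligned = {
    "human":   "ACGT-ACGTA",
    "chimp":   "ACGT-ACGTA",
    "gorilla": "ACGTTACCTA",
    "macaque": "AC-TAACCTT",
}
# Hypothetical reconstructed ancestral sequence, aligned to the same columns.
ancestor = "ACGT-ACCTA"


def reference_columns(aligned, ref="human"):
    """Return the alignment columns in which the reference is not a gap."""
    columns = []
    for i, ref_base in enumerate(aligned[ref]):
        if ref_base == "-":
            continue  # only positions that exist in the reference are scored
        columns.append({sp: seq[i].upper() for sp, seq in aligned.items()})
    return columns


def fraction_perfectly_conserved(aligned, ref="human"):
    """Fraction of reference positions identical in every species."""
    columns = reference_columns(aligned, ref)
    if not columns:
        return 0.0
    identical = sum(1 for col in columns if len(set(col.values())) == 1)
    return identical / len(columns)


def mean_species_sharing(aligned, ref="human"):
    """Average number of other species carrying the reference nucleotide."""
    columns = reference_columns(aligned, ref)
    if not columns:
        return 0.0
    shared = sum(
        sum(1 for sp, base in col.items() if sp != ref and base == col[ref])
        for col in columns
    )
    return shared / len(columns)


def identity_to_sequence(ref_seq, other_seq):
    """Fraction of non-gap reference positions that match another sequence."""
    pairs = [(r.upper(), o.upper()) for r, o in zip(ref_seq, other_seq) if r != "-"]
    if not pairs:
        return 0.0
    return sum(1 for r, o in pairs if r == o) / len(pairs)


print(fraction_perfectly_conserved(aligned))
print(mean_species_sharing(aligned))
print(identity_to_sequence(aligned["human"], ancestor))
```

In this toy case the human sequence is closer to the ancestor than the 
"identical in all species" fraction suggests, which is the situation 
described above: a region can differ between primates and still match the 
ancestral state in human.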

Matthieu

On 03/06/15 14:52, Julien Roux wrote:
> Dear list,
> I am trying to retrieve, for a given human genomic region, the
> nucleotide sequence conservation based on the EPO 8-way genomic
> alignments.
> I successfully managed to extract the genomic alignment of a specific
> human region (see attached script) using help from the tutorial
> (www.ensembl.org/info/docs/api/compara/compara_tutorial.html), and with
> the MethodLinkSpeciesSetAdaptor parameters detailed here:
> http://www.ensembl.org/info/genome/compara/analyses.html
> Thanks for that!
>
> Now, I would like a conservation score for this region. A sensible score
> could for example be the % of the human nucleotides in this region
> showing perfect conservation in the 8 species. Or it could be the
> average, across all nucleotides in this region, of the number of species
> showing the same nucleotide as human. Do you have suggestions, based
> on your experience, on which measure to consider?
>
> And before implementing this manually, I would like to be sure that
> something similar is not already implemented or easy to extract from the
> API? For example I've seen a score() function for
> Bio::EnsEMBL::Compara::GenomicAlignBlock objects. What does it
> correspond to?
> http://www.ensembl.org/info/docs/Doxygen/compara-api/classBio_1_1EnsEMBL_1_1Compara_1_1GenomicAlignBlock.html#aa1480f9cf9068f9102f67e8b586f1cce
> I also found this, but I have no idea if it can be used for genomic
> alignments:
> http://www.ensembl.org/info/docs/Doxygen/compara-api/classBio_1_1EnsEMBL_1_1Compara_1_1ConservationScore.html
>
> Thanks for any tip!
> Best
> Julien
>
> --
> Julien Roux
> Marie-Curie postdoctoral fellow
> Department of Ecology and Evolution, University of Lausanne, Switzerland
> http://www.unil.ch/dee/home/menuinst/people/post-docs--associates/dr-julien-roux.html
> Tel: +41 78 700 2931
>
>
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>

-- 
Matthieu Muffato, Ph.D.
Ensembl Compara and TreeFam Project Leader
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom
Room  A3-145
Phone + 44 (0) 1223 49 4631
Fax   + 44 (0) 1223 49 4468



