[ensembl-dev] Erroneous duplications in gene trees
Julien Roux
julien.roux at unil.ch
Tue Mar 10 13:10:56 GMT 2015
Dear Ensembl team,
I wanted to report a strange behavior/feature of the Compara Gene Trees.
I often see in gene trees that some genes do not branch where they are
supposed to, which is inevitable I guess. However in some cases, this
leads to the inference of false duplications. I find it very useful that
Ensembl provides a confidence score for duplications, and labels those
with a score of 0 as "dubious" duplications.
But I find it surprising that a duplication with a score of 1% is labeled
as a "real" duplication. I see this happen quite often, and it seems to me
that the threshold used to call a dubious duplication should be increased.
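To make the suggestion concrete, here is a minimal sketch of the labeling
rule with an adjustable threshold. It is not Ensembl code: the function
names, the `dubious_threshold` parameter and the 0.05 value are hypothetical,
and the score definition (fraction of species shared by the two subtrees
below the duplication node) is my understanding of the Compara score and
should be checked against the Ensembl documentation.

```python
def duplication_score(left_species, right_species):
    """Fraction of species found in both child subtrees of a node.

    Assumed definition of the duplication confidence score:
    |intersection| / |union| of the two species sets.
    """
    left, right = set(left_species), set(right_species)
    union = left | right
    if not union:
        raise ValueError("both subtrees are empty")
    return len(left & right) / len(union)


def label_duplication(score, dubious_threshold=0.0):
    """Label a duplication node from its confidence score (0 to 1).

    With the default threshold of 0.0 only a score of exactly 0 is
    'dubious', which matches the behavior described above. A higher
    (hypothetical) threshold also flags weakly supported nodes.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return "dubious" if score <= dubious_threshold else "duplication"


# A node supported by 1% of species under the two rules:
current = label_duplication(0.01)                           # threshold 0
stricter = label_duplication(0.01, dubious_threshold=0.05)  # hypothetical 5%
```

With the default threshold the 1% node is kept as a duplication; with the
hypothetical 5% threshold it would be flagged as dubious.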
See for example the ENSAMXG00000008930 gene, which clusters outside of
the fish clade, leading to a false duplication at the base of the
vertebrate lineage (in addition, this gene seems to be only a fragment of
the gene model, which should be labeled as a gene_split event):
http://www.ensembl.org/Astyanax_mexicanus/Gene/Compara_Tree?db=core;g=ENSAMXG00000008930;r=KB882106.1:4258864-4262355;t=ENSAMXT00000009178;collapse=2625743,2625667,2625666,2625573
Maybe there is some justification for this choice; please let me know
what you think.
Best regards
Julien
--
Julien Roux
Marie-Curie postdoctoral fellow
Department of Ecology and Evolution, University of Lausanne, Switzerland
http://www.unil.ch/dee/home/menuinst/people/post-docs--associates/dr-julien-roux.html
Tel: +41 78 700 2931