[ensembl-dev] algorithm of FindSplitGenesOnTree
Pengcheng Yang
pengchy at gmail.com
Tue Mar 5 17:49:00 GMT 2013
Hi,
I would like to understand the algorithm behind the FindSplitGenesOnTree
class, so I read the comments in the file
http://www.ensembl.org/info/docs/Doxygen/compara-api/FindSplitGenesOnTree_8pm_source.html.
However, I am still unclear about the background of the algorithm.
My understanding is:
1. For the genes in one family, build a multiple alignment and construct a
tree using TreeBeST.
2. Find the gene IDs with the shortest (A) and longest (B) lengths.
3. Get the gene IDs (C) that are next to gene A on the same branch of the tree.
4. Check whether C and A overlap by more than x aa in the multiple
alignment. If not, they may be a split_gene pair.
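
To make the four steps above concrete, here is a minimal Python sketch of my
understanding. It is not the Ensembl Compara implementation: the nested-tuple
tree, the '-' gap character, the gene names, and the names max_overlap and
candidate_split_pairs are all assumptions made for illustration. Step 1 is
taken as already done, and the longest gene (B) is not used because my steps
do not say what happens to it.

```python
# Hypothetical sketch of steps 2-4 above; NOT the Ensembl Compara code.
# Assumes the alignment and tree (step 1) have already been produced.

def ungapped_length(aligned):
    """Number of residues (non-gap characters) in an aligned sequence."""
    return sum(1 for c in aligned if c != "-")

def overlap(a, b):
    """Number of alignment columns where both sequences have a residue."""
    return sum(1 for x, y in zip(a, b) if x != "-" and y != "-")

def leaves(node):
    """All leaf names under a nested-tuple tree node."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]

def neighbours(tree, target):
    """Leaves of the smallest subtree containing target, minus target (step 3)."""
    if isinstance(tree, str):
        return None
    for child in tree:
        if child == target:
            return [leaf for leaf in leaves(tree) if leaf != target]
        found = neighbours(child, target)
        if found is not None:
            return found
    return None

def candidate_split_pairs(tree, alignment, max_overlap=10):
    """Pair the shortest gene with neighbours it barely overlaps (steps 2 and 4)."""
    shortest = min(alignment, key=lambda g: ungapped_length(alignment[g]))
    pairs = []
    for other in neighbours(tree, shortest) or []:
        if overlap(alignment[shortest], alignment[other]) <= max_overlap:
            pairs.append((shortest, other))
    return pairs

# Toy data: geneA and geneC cover complementary parts of geneB.
alignment = {
    "geneA": "MKTLL-----------",
    "geneC": "------VVAGGHLKRS",
    "geneB": "MKTLLAVVAGGHLKRS",
}
tree = (("geneA", "geneC"), "geneB")
result = candidate_split_pairs(tree, alignment, max_overlap=2)
print(result)
```

With this toy data, geneA and geneC sit on the same branch and share no
aligned columns, so they would be reported as a possible split_gene pair.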
Is that correct? And where can I find documentation of the algorithm? I know
one way is to read the source code, but it would be quicker to understand if
there were documentation.
Thank you.
Best,
Pengcheng Yang