[ensembl-dev] algorithm of FindSplitGenesOnTree

Pengcheng Yang pengchy at gmail.com
Tue Mar 5 17:49:00 GMT 2013


I want to understand the algorithm of the FindSplitGenesOnTree class, so
I read the comments in the file.
However, I am still unclear about the background of the algorithm.
My understanding is:
1. For the genes in one family, build a multiple alignment and construct
a tree using TreeBeST.
2. Find the gene ids with the shortest (A) and longest (B) length.
3. Get the gene ids (C) that are next to gene A in the same branch of the
tree.
4. Check whether C and A overlap by more than x aa in the multiple
alignment. If not, they may be one split_gene pair.
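
To make steps 3 and 4 concrete, here is a minimal Python sketch of the
overlap check as I understand it. It is not the Ensembl code and may not
match what FindSplitGenesOnTree really does. The function names, the
`neighbours` list (standing in for the tree lookup in step 3), the toy
sequences and the `max_overlap` threshold (the "x aa" above) are all
hypothetical.

```python
def aligned_overlap(seq_a, seq_b, gap="-"):
    """Count alignment columns where both sequences have a residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != gap and b != gap)


def candidate_split_pairs(alignment, neighbours, max_overlap):
    """Return neighbouring pairs whose overlap is at most max_overlap.

    alignment:   dict of gene id -> gapped sequence from the alignment
    neighbours:  (gene A, gene C) pairs adjacent in the tree (assumed to
                 be known already; the tree itself is not modelled here)
    max_overlap: hypothetical threshold, the "x aa" in step 4
    """
    pairs = []
    for gene_a, gene_c in neighbours:
        overlap = aligned_overlap(alignment[gene_a], alignment[gene_c])
        if overlap <= max_overlap:
            pairs.append((gene_a, gene_c, overlap))
    return pairs


# Toy alignment: geneA and geneC cover different halves of geneB.
alignment = {
    "geneA": "MKTLLV------------",
    "geneC": "------AGHPQRSTWYVV",
    "geneB": "MKTLLVAGHPQRSTWYVV",
}
neighbours = [("geneA", "geneC"), ("geneA", "geneB")]
result = candidate_split_pairs(alignment, neighbours, max_overlap=2)
print(result)
```

In this toy case only geneA and geneC would be reported, because they
share no aligned columns, while geneA and geneB overlap over the whole
length of geneA.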

Is this correct? And where can I find documentation of the algorithm? I
know one way is to read the source code, but it would be much quicker to
understand if there were documentation.

Thank you.

Pengcheng Yang
