[ensembl-dev] algorithm of FindSplitGenesOnTree
pengchy at gmail.com
Tue Mar 5 17:49:00 GMT 2013
I want to understand the algorithm behind the FindSplitGenesOnTree class, so I
read the comments in the file.
However, I am still unclear about the background of the algorithm.
My understanding is:
1. For the genes in one family, do a multiple alignment and construct a tree.
2. Find the gene ids with the shortest (A) and longest (B) length.
3. Get the gene ids (C) that are next to gene A in the same branch of the tree.
4. Check whether C and A overlap by more than x aa in the multiple
alignment. If not, they may be a split_gene pair.
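
To make the question concrete, here is a minimal Python sketch of steps 2 to 4 as I have described them. It is only my reading, not the Ensembl code: the function names, the nested-tuple tree, and the "-" gap character are all assumptions made for illustration, and step 1 (alignment and tree building) is taken as already done. Since I do not know how the longest gene (B) is used, the sketch leaves it out.

```python
GAP = "-"  # assumed gap character in the aligned sequences


def ungapped_length(aligned):
    """Number of residues (non-gap columns) in one aligned sequence."""
    return sum(1 for c in aligned if c != GAP)


def overlap(a, b):
    """Number of alignment columns where both sequences have a residue."""
    return sum(1 for x, y in zip(a, b) if x != GAP and y != GAP)


def leaves(node):
    """All gene ids under a node; a leaf is a plain string."""
    if isinstance(node, str):
        return [node]
    found = []
    for child in node:
        found.extend(leaves(child))
    return found


def branch_neighbours(tree, gene):
    """Other genes under the parent node of `gene` (hypothetical reading
    of 'next to gene A in the same branch'). None if gene is absent."""
    if isinstance(tree, str):
        return None
    if any(child == gene for child in tree):
        return [g for g in leaves(tree) if g != gene]
    for child in tree:
        found = branch_neighbours(child, gene)
        if found is not None:
            return found
    return None


def candidate_split_pairs(tree, alignment, max_overlap):
    """Steps 2-4: pair the shortest gene (A) with each neighbour (C)
    whose overlap with A is not greater than max_overlap residues."""
    shortest = min(alignment, key=lambda g: ungapped_length(alignment[g]))
    neighbours = branch_neighbours(tree, shortest) or []
    return [
        (shortest, c)
        for c in neighbours
        if overlap(alignment[shortest], alignment[c]) <= max_overlap
    ]


# Toy family: geneA and geneC cover different halves of geneB.
alignment = {
    "geneA": "MKT-----",
    "geneC": "----LLVA",
    "geneB": "MKTPLLVA",
}
tree = (("geneA", "geneC"), "geneB")
pairs = candidate_split_pairs(tree, alignment, max_overlap=0)
```

In this toy family geneA and geneC share no aligned columns, so they would be reported as a possible split pair under my reading. If the real class works differently (for example in how it picks neighbours or uses B), that is exactly what I would like to learn.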
Is this correct? And where can I find documentation of the algorithm? I know one
way is to read the source code, but it would be understood more quickly if
there were documentation.