[ensembl-dev] ortholog vs paralog?

Javier Herrero jherrero at ebi.ac.uk
Thu Jul 5 20:39:45 BST 2012

Dear Jim

Checking the homology description, as you are doing, should work as you intend.

Split genes are an exceptional case of paralogues that are not 
paralogues. In other words, the tree says they are paralogues, but in 
reality they are two halves of the same gene. As such, they aren't 
homologues, even if they are in the same tree.
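
As a rough sketch of that check (not an official Ensembl recipe), the description test could be wrapped in a small helper that handles gene splits before the ortholog/paralog regexes. The helper name classify_homology and the category labels are hypothetical. Of the example strings, only "putative_gene_split" comes from this thread; the other two are illustrative assumptions about what a description may look like.

```perl
use strict;
use warnings;

# Hypothetical helper: map a homology description string to a coarse
# category. Gene splits are tested first so that they are never
# reported as paralogues, even though they sit in the same tree.
sub classify_homology {
    my ($description) = @_;
    return 'unknown' unless defined $description;

    return 'gene_split' if $description =~ /gene_split/;
    return 'ortholog'   if $description =~ /ortholog/;
    return 'paralog'    if $description =~ /paralog/;
    return 'unknown';
}

# Example strings: only "putative_gene_split" appears in this thread;
# the other two are assumed for illustration.
for my $d (qw(ortholog_one2one within_species_paralog putative_gene_split)) {
    printf "%-24s => %s\n", $d, classify_homology($d);
}
```

Inside the inner loop of the script quoted below, this would be called as classify_homology($o->[1]->description), and pairs coming back as 'gene_split' could be skipped or written to a separate file.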

Kind regards


On 05/07/12 19:47, Thomason, James wrote:
> Howdy,
> Could somebody point me towards the proper way to differentiate between an orthologous vs paralogous gene?
> I've got a simple script that needs to dump out all orthologs and paralogs, simplified here:
> my @genes = ();  # get list of genes somehow
> while (my $gene = shift @genes) {
>      my $orthologues = $gene->get_all_homologous_Genes;
>      while (my $o = shift @$orthologues) {
>          #print out the pair
>      }
> }
> Inside that inner while loop, I was just looking at the description ($o->[1]->description, in this case), checking it against the regexes /ortholog/ and /paralog/, and printing out to the appropriate file that way. But I know that's not quite right, because I found a paralog pair that has a description of "putative_gene_split", which fails my handy little regex check.
> So what I wanted to know is - what's the proper way to differentiate between the two of them? My biology knowledge is incredibly light, but a little poking around suggests it might be as simple as comparing the species of the two genes (the one I originally called get_all_homologous_Genes on and the one I just shifted off the @$orthologues array). If they're the same species, the genes are paralogs; if different, orthologs.
> Am I on the right track and is that sufficient? If not, what's the proper way to go about it?
> Many thanks,
> --
> -Jim Thomason...
> Scientific Informatics Developer @ The Ware Lab,
> a USDA-ARS Laboratory at Cold Spring Harbor Laboratory
> http://www.warelab.org/
> http://www.cshl.edu/
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/

Javier Herrero, PhD
Ensembl Coordinator and Ensembl Compara Project Leader
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK
