[ensembl-dev] fetch all within species paralogous genes

shaohua fan shaohua.fan at uni-konstanz.de
Thu Apr 19 09:54:47 BST 2012


Dear Matthieu, 

Sorry to bother you again. I read through some earlier emails and modified your script a little. However, my modification is very limited: I can now print the paralogous genes pairwise, but I cannot print them as a group. 

Could you please have a look at the script I modified? It would be great if you could also give some hints on how to do this job between two genomes. 

Thanks a lot! 

Shaohua 

use strict;
use warnings;
use Bio::EnsEMBL::Registry;
use Bio::SimpleAlign;
use Bio::AlignIO;

# Connect to the public Ensembl database server.
my $reg = "Bio::EnsEMBL::Registry";
$reg->load_registry_from_db(
    -host => "ensembldb.ensembl.org",
    -user => "anonymous",
);

# Describe the data wanted: paralogues within human.
my $mlss_adaptor = $reg->get_adaptor("Multi", "compara", "MethodLinkSpeciesSet");
my $mlss = $mlss_adaptor->fetch_by_method_link_type_registry_aliases("ENSEMBL_PARALOGUES", ["human"]);

# Fetch every within-species paralogue relation for that set.
my $homology_adaptor = $reg->get_adaptor("Multi", "compara", "Homology");
my $all_paralogues = $homology_adaptor->fetch_all_by_MethodLinkSpeciesSet_orthology_type($mlss, "within_species_paralog");

my $count = 0;

# Each Homology object holds one pair of genes, so this prints pairs, not groups.
while (my $homol = shift @{$all_paralogues}) {
    $count++;
    print "group id $count\n";

    # Print the stable IDs of both members on one line, tab-separated.
    my @stable_ids;
    foreach my $member_attribute (@{$homol->get_all_Member_Attribute}) {
        my ($member, $attribute) = @{$member_attribute};
        push @stable_ids, $member->stable_id;
    }
    print join("\t", @stable_ids), "\n";
}
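
One possible way to get from pairs to groups, sketched here under assumptions, is to treat every pair as a link and merge linked genes into one set (a union-find over stable IDs). This is plain Perl and not part of the Ensembl API: the helper name group_pairs and the stable IDs below are hypothetical, and it assumes that a "group" means all genes connected through any chain of pairwise paralogue relations.

```perl
use strict;
use warnings;

# Group pairwise relations into connected components (union-find).
# Input: array ref of [id_a, id_b] pairs. Output: array ref of sorted groups.
# NOTE: group_pairs is a hypothetical helper, not an Ensembl API method.
sub group_pairs {
    my ($pairs) = @_;
    my %parent;

    # Find the representative of an ID, with path compression.
    my $find = sub {
        my ($x) = @_;
        $parent{$x} = $x unless exists $parent{$x};
        my $root = $x;
        $root = $parent{$root} while $parent{$root} ne $root;
        while ($parent{$x} ne $root) {
            my $next = $parent{$x};
            $parent{$x} = $root;
            $x = $next;
        }
        return $root;
    };

    # Merge the two sets of every pair.
    for my $pair (@$pairs) {
        my ($root_a, $root_b) = map { $find->($_) } @$pair;
        $parent{$root_a} = $root_b if $root_a ne $root_b;
    }

    # Collect members under their representative, in sorted order.
    my %members;
    push @{ $members{ $find->($_) } }, $_ for sort keys %parent;

    return [ sort { $a->[0] cmp $b->[0] } values %members ];
}

# Hypothetical stable IDs, standing in for the pairs printed by the loop above.
my @pairs = (
    ['GENE_A', 'GENE_B'],
    ['GENE_B', 'GENE_C'],
    ['GENE_D', 'GENE_E'],
);

my $count = 0;
for my $group (@{ group_pairs(\@pairs) }) {
    $count++;
    print "group id $count: @$group\n";
}
```

With the script above, the pairs could be collected by pushing the two stable IDs of each homology onto @pairs instead of printing them, and then calling group_pairs once after the loop.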

On Apr 17, 2012, at 3:17 PM, Matthieu Muffato wrote:

> Dear Shaohua
> 
> The most efficient way is probably to use the fetch_all_by_MethodLinkSpeciesSet_orthology_type() method from the HomologyAdaptor
> 
> You first have to create a MethodLinkSpeciesSet object
> my $mlss_adaptor = $reg->get_adaptor("Multi", "compara", "MethodLinkSpeciesSet");
> my $mlss = $mlss_adaptor->fetch_by_method_link_type_registry_aliases("ENSEMBL_PARALOGUES", ["human"]);
> This object describes which kind of data you want, and on which species
> 
> Then, you can retrieve the homologies:
> my $homology_adaptor = $reg->get_adaptor("Multi", "compara", "Homology");
> my $all_paralogues = $homology_adaptor->fetch_all_by_MethodLinkSpeciesSet_orthology_type($mlss, "within_species_paralog");
> 
> This method can be extended to orthologues and to pairs of species. It is the fastest approach when fetching large sets of homologues.
> 
> Hope this helps,
> Matthieu
> 
> On 17/04/12 10:45, shaohua fan wrote:
>> Dear all,
>> 
>> I am new to the Ensembl Compara API. Is there a way to fetch all the within-species paralogous genes of one specified genome using the Compara API? I have searched earlier emails and the Ensembl website, and only found a solution for a single gene, in the Compara API tutorial.
>> 
>> Thanks a lot!
>> 
>> shaohua

-----------------------------------------------------------------------------------
Shaohua Fan, Ph.D student.
Professor Axel Meyer's Lab
Department of Biology (Fach M617)
University of Konstanz
Universitaetsstrasse 10
D-78457 Konstanz
Germany
