[ensembl-dev] Homology

Javier Herrero jherrero at ebi.ac.uk
Thu Nov 17 15:37:45 GMT 2011


Hi James

I am glad it works for you now. Yes, starting from a new session is key 
as you were probably using an old environment.

To get the homologues in a particular species, you can use the 
fetch_all_by_Member_paired_species($member, "mus_musculus") method 
instead of fetch_all_by_Member(). The method is documented at 
<http://www.ensembl.org/info/docs/Doxygen/compara-api/classBio_1_1EnsEMBL_1_1Compara_1_1DBSQL_1_1HomologyAdaptor.html#abe833d3aacdf6854e9db032295c7da25>

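A minimal sketch of that call, assuming the e64 API and the public 
database server used in the scripts quoted below. The gene ID is the one 
from the slides script, and the @species list is only an illustration 
(add your second species name to it):

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $reg = "Bio::EnsEMBL::Registry";

# Connect to the public Ensembl database server, as in the slides script.
$reg->load_registry_from_db(
    -host => "ensembldb.ensembl.org",
    -user => "anonymous",
);

my $member_adaptor   = $reg->get_adaptor("Multi", "compara", "Member");
my $homology_adaptor = $reg->get_adaptor("Multi", "compara", "Homology");

# Undefined adaptors usually mean the API version does not match the database.
die "No Compara adaptors: check the API version and the database connection\n"
    unless $member_adaptor && $homology_adaptor;

# Homologies are stored at the gene level, so fetch a gene member.
my $gene_id = "ENSG00000000971";    # gene from the slides script
my $member  = $member_adaptor->fetch_by_source_stable_id("ENSEMBLGENE", $gene_id);
die "No gene member found for $gene_id\n" unless defined $member;

# Illustration only: add the name of your second species to this list.
my @species = ("mus_musculus");

foreach my $species (@species) {
    # Only the homologies between the query gene and this species.
    my $homologies =
        $homology_adaptor->fetch_all_by_Member_paired_species($member, $species);

    foreach my $homology (@$homologies) {
        print $homology->description, "\n";
        foreach my $member_attribute (@{ $homology->get_all_Member_Attribute }) {
            my ($this_member, $this_attribute) = @$member_attribute;
            print "  ", $this_member->genome_db->name, " ",
                $this_member->stable_id, "\n";
        }
    }
}
```

If $member comes back undefined (for example, with an invalid gene 
stable ID), the script stops there rather than failing inside 
HomologyAdaptor.pm with the "dbID" error.
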
I hope this helps

Javier

On 17/11/11 15:28, James Blackshaw wrote:
> Hello Javier,
> I'm unsure why, but it works now. It may be because I was trying to 
> run it in an SSH session I started before I installed the ensembl 
> package. All I need to do now is work out how to put in a filter to 
> only look for homologs in the two species I'm interested in.
>
> Thank you for the advice,
> -James
>
> On 16/11/2011 19:33, Javier Herrero wrote:
>> Hi James
>>
>> Sorry, I thought BioMart would have solved your problem. I guess it 
>> would be easier if we stick to just one script rather than the two 
>> you mentioned in an earlier email. I think I would opt for the first 
>> one (slides) as the other one seems a little outdated (from the list).
>>
>> So, the script from the slides works out of the box (copy & paste 
>> from your email) for me, using e64 API.
>>
>> It also works if I replace the gene stable ID with 
>> ENSDARG00000090113. However, if I use a non-existent gene stable ID 
>> or anything different from the gene stable ID, I get the error you 
>> reported in another email:
>> /Can't call method "dbID" on an undefined value at 
>> ~/ensembl-compara/modules/Bio/EnsEMBL/Compara/DBSQL/HomologyAdaptor.pm line 
>> 37./
>>
>> Please, use that script and check that it works (with the original 
>> gene identifier). Then change the identifier to get the data for your 
>> gene of interest and check that it still works.
>>
>> If you get a failure with the original script, there is something 
>> wrong with your setup (API installation, access to the database, 
>> etc). If you get a failure using a different gene identifier, then 
>> you are using an invalid gene identifier.
>>
>> I hope this helps
>>
>> Javier
>>
>> On 16/11/11 18:13, James Blackshaw wrote:
>>> Hello,
>>>
>>> Just a bit of a bump on this as I can't get the script to work and 
>>> I've checked that I am using version 64 of Homologene. I could 
>>> really do with a decent way to get homologs for a given species out 
>>> of Ensembl for a list of genes. Sadly Biomart isn't much use as I 
>>> can't get homologs from yeast or mouse to plants, they're on separate 
>>> filters. Does anyone on this list have a similar script they might 
>>> be willing to send me to see if I can get it to return anything?
>>>
>>> Regards,
>>> James
>>>>>> Hi,
>>>>>> this still doesn't work, I get the error:
>>>>>> "Can't call method "fetch_by_source_stable_id" on an undefined 
>>>>>> value at homology_workshop_getAllHomologuesForGene.pl line 28."
>>>>>>
>>>>>> I get the following error off the script for comparing all 
>>>>>> homologs between two species. A pity, as that is otherwise 
>>>>>> perfect for what I want.
>>>>>> "Can't call method "get_HomologyAdaptor" on an undefined value at 
>>>>>> homology_getAllHomologuesBetween2Species.pl line 24."
>>>>>>
>>>>>> Is there anywhere I'd be able to check the version number of the 
>>>>>> API I have?
>>>>>>
>>>>>> Regards,
>>>>>> -James
>>>>>>
>>>>>> On 14/11/2011 12:02, Matthieu Muffato wrote:
>>>>>>> Hi James
>>>>>>>
>>>>>>> I just hope there is no misunderstanding with the line numbers. 
>>>>>>> Your script seems to work well, at least here, replacing the 
>>>>>>> $homology_adaptor->fetch_all_by_Member($member) line by 
>>>>>>> $homology_adaptor->fetch_all_by_Member($member->gene_member)
>>>>>>>
>>>>>>> I can reproduce the "dbID" error message when applying both changes. 
>>>>>>> They are actually not compatible. Either you specify the gene ID 
>>>>>>> in the first place, and stick with 
>>>>>>> $homology_adaptor->fetch_all_by_Member($member), or you keep the 
>>>>>>> protein ID, and you add the "->gene_member"
>>>>>>>
>>>>>>> There is a script in the scripts/examples directory of the 
>>>>>>> ensembl-compara repository, named 
>>>>>>> homology_workshop_getAllHomologuesForGene.pl, that does the job 
>>>>>>> if you know the gene name.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Matthieu
>>>>>>>
>>>>>>> On 14/11/11 11:41, James Blackshaw wrote:
>>>>>>>> Hi Matthieu,
>>>>>>>> Changing Line 16 doesn't help, I get the same error reported.
>>>>>>>> Changing line 25 gives me a new error though.
>>>>>>>> Can't call method "dbID" on an undefined value at
>>>>>>>> /usr/mbu/software/ensembl/ensembl-compara/modules/Bio/EnsEMBL/Compara/DBSQL/HomologyAdaptor.pm 
>>>>>>>>
>>>>>>>> line 37.
>>>>>>>>
>>>>>>>> What I'm trying to do is get the homologs for a list of 
>>>>>>>> proteins or
>>>>>>>> associated genes, either works. I'm surprised there's not 
>>>>>>>> already a
>>>>>>>> script about for it, but can't find one on the mailing list.
>>>>>>>>
>>>>>>>> -James
>>>>>>>>
>>>>>>>>
>>>>>>>> On 14/11/2011 11:25, Matthieu Muffato wrote:
>>>>>>>>> Hi James
>>>>>>>>>
>>>>>>>>> In Compara, the homologies are stored at the gene level. 
>>>>>>>>> Because your
>>>>>>>>> Member object represents a peptide, the homology list that you
>>>>>>>>> retrieve is empty.
>>>>>>>>>
>>>>>>>>> You can either change the line 25 to
>>>>>>>>> my $homologies =
>>>>>>>>> $homology_adaptor->fetch_all_by_Member($member->gene_member);
>>>>>>>>>
>>>>>>>>> or change the line 16 to
>>>>>>>>> my $member = 
>>>>>>>>> $member_adaptor->fetch_by_source_stable_id("ENSEMBLGENE",
>>>>>>>>> ****) with an Ensembl gene ID if you know it
>>>>>>>>>
>>>>>>>>> to have all the homologies of your favourite gene
>>>>>>>>>
>>>>>>>>> Hope this helps,
>>>>>>>>> Matthieu
>>>>>>>>>
>>>>>>>>> On 14/11/11 11:15, James Blackshaw wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> I am using the code from the API installation page and that 
>>>>>>>>>> page talks
>>>>>>>>>> about version 64.
>>>>>>>>>> http://www.ensembl.org/info/docs/api/api_installation.html
>>>>>>>>>>
>>>>>>>>>> This is my output under verbose, looks like it is connecting.
>>>>>>>>>> Odd number of elements in hash assignment at
>>>>>>>>>> /usr/mbu/software/ensembl/ensembl/modules/Bio/EnsEMBL/Utils/Argument.pm 
>>>>>>>>>>
>>>>>>>>>> line 148.
>>>>>>>>>> Transcript:ENSDART00000112153 Gene:ENSDARG00000090113 Chr:19
>>>>>>>>>> Start:33519855 End:33522332
>>>>>>>>>> Can't call method "get_all_Member_Attribute" on an undefined 
>>>>>>>>>> value at
>>>>>>>>>> sandbox3.pl line 47.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> James
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11/11/2011 20:20, Javier Herrero wrote:
>>>>>>>>>>> Hi James
>>>>>>>>>>>
>>>>>>>>>>> This error is typical when you fail to connect to the database.
>>>>>>>>>>>
>>>>>>>>>>> I suspect you are using the HEAD code instead of the branch 
>>>>>>>>>>> 64. The
>>>>>>>>>>> head code is already configured for connecting to the 
>>>>>>>>>>> forthcoming
>>>>>>>>>>> database. Please switch your code to the 64 branch and try 
>>>>>>>>>>> again:
>>>>>>>>>>>
>>>>>>>>>>> cvs up -r branch-ensembl-64
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>>
>>>>>>>>>>> Javier
>>>>>>>>>>>
>>>>>>>>>>> On 11/11/11 19:29, Jan Vogel wrote:
>>>>>>>>>>>> Hi James,
>>>>>>>>>>>>
>>>>>>>>>>>> the first script from the powerpoint works for me w/o 
>>>>>>>>>>>> problems or
>>>>>>>>>>>> modifications. I've tried schema 62 and schema 64 APIs.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The second one ( 'from the list' ) had some minor problems. 
>>>>>>>>>>>> It works
>>>>>>>>>>>> like this :
>>>>>>>>>>>>
>>>>>>>>>>>> my $member =
>>>>>>>>>>>> $member_adaptor->fetch_by_source_stable_id('ENSEMBLGENE','ENSG00000004059'); 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> or
>>>>>>>>>>>>
>>>>>>>>>>>> my $member =
>>>>>>>>>>>> $member_adaptor->fetch_by_source_stable_id(undef,'ENSG00000004059'); 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I did not try the zfish protein.
>>>>>>>>>>>>
>>>>>>>>>>>> Hth,
>>>>>>>>>>>> Jan
>>>>>>>>>>>>
>>>>>>>>>>>> On Nov 11, 2011, at 9:50 AM, James Blackshaw wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> I've been trying to put together some scripts for finding the
>>>>>>>>>>>>> homologies
>>>>>>>>>>>>> for some lists of genes I'm interested in, but I keep 
>>>>>>>>>>>>> getting errors
>>>>>>>>>>>>> with the "fetch" methods.
>>>>>>>>>>>>>
>>>>>>>>>>>>> "Can't call method "fetch_by_source_stable_id" on an 
>>>>>>>>>>>>> undefined
>>>>>>>>>>>>> value at
>>>>>>>>>>>>> sandbox3.pl line 15."
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've used one script taken from a presentation by Stephen
>>>>>>>>>>>>> Fitzgerald at
>>>>>>>>>>>>> Edinburgh, and another from this mailing list. I'm 
>>>>>>>>>>>>> including both.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  From the powerpoint:
>>>>>>>>>>>>> use strict;
>>>>>>>>>>>>> use Bio::EnsEMBL::Registry;
>>>>>>>>>>>>> my $reg = "Bio::EnsEMBL::Registry";
>>>>>>>>>>>>>
>>>>>>>>>>>>> $reg->load_registry_from_db(
>>>>>>>>>>>>> -host => "ensembldb.ensembl.org",
>>>>>>>>>>>>> -user =>  "anonymous");
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> my $ma = $reg->get_adaptor(
>>>>>>>>>>>>> "Multi", "compara", "Member");
>>>>>>>>>>>>> my $member = $ma->fetch_by_source_stable_id(
>>>>>>>>>>>>> "ENSEMBLGENE", "ENSG00000000971");
>>>>>>>>>>>>>
>>>>>>>>>>>>> my $homology_adaptor = $reg->get_adaptor(
>>>>>>>>>>>>> "Multi", "compara", "Homology");
>>>>>>>>>>>>>
>>>>>>>>>>>>> my $homologies = $homology_adaptor->
>>>>>>>>>>>>> fetch_all_by_Member($member);
>>>>>>>>>>>>>
>>>>>>>>>>>>> foreach my $this_homology (@$homologies) {
>>>>>>>>>>>>> print $this_homology->description, "\n";
>>>>>>>>>>>>> my $member_attributes = $this_homology->
>>>>>>>>>>>>> get_all_Member_Attribute();
>>>>>>>>>>>>> foreach my $this_mem_attr (@$member_attributes) {
>>>>>>>>>>>>> my ($this_member, $this_attribute) =
>>>>>>>>>>>>> @$this_mem_attr;
>>>>>>>>>>>>> print $this_member->genome_db->name, " ",
>>>>>>>>>>>>> $this_member->source_name, " ",
>>>>>>>>>>>>> $this_member->stable_id, "\n";
>>>>>>>>>>>>> }
>>>>>>>>>>>>> print "\n";
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> ==========================================
>>>>>>>>>>>>>
>>>>>>>>>>>>>  From the list:
>>>>>>>>>>>>> use Bio::EnsEMBL::Registry;
>>>>>>>>>>>>> Bio::EnsEMBL::Registry->load_registry_from_db(
>>>>>>>>>>>>> -host =>  'ensembldb.ensembl.org',
>>>>>>>>>>>>> -user =>  'anonymous',
>>>>>>>>>>>>> -port =>  5306);
>>>>>>>>>>>>> my $member_adaptor = Bio::EnsEMBL::Registry->get_adaptor(
>>>>>>>>>>>>> 'Multi','compara','Member');
>>>>>>>>>>>>>
>>>>>>>>>>>>> # fetch a Member
>>>>>>>>>>>>> # get the MemberAdaptor
>>>>>>>>>>>>> my $member_adaptor =
>>>>>>>>>>>>> Bio::EnsEMBL::Registry->get_adaptor('Multi','compara','Member'); 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> # fetch a Memmy $member =
>>>>>>>>>>>>> $member_adaptor->fetch_by_source_stable_id('ENSEMBLPROTEIN','ENSG00000004059'); 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> my $member =
>>>>>>>>>>>>> $member_adaptor->fetch_by_source_stable_id('ENSEMBLPEP','ENSDARP00000103634'); 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> # print out some information about the Member
>>>>>>>>>>>>> print $member->description, "\n";
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> my $homology_adaptor = 
>>>>>>>>>>>>> Bio::EnsEMBL::Registry->get_adaptor('Multi',
>>>>>>>>>>>>> 'compara', 'Homology');
>>>>>>>>>>>>> my $homologies = 
>>>>>>>>>>>>> $homology_adaptor->fetch_all_by_Member($member);
>>>>>>>>>>>>>
>>>>>>>>>>>>> # That will return a reference to an array with all
>>>>>>>>>>>>> homologies(orthologues in
>>>>>>>>>>>>> # other species and paralogues in the same one)
>>>>>>>>>>>>> # Then for each homology, you can get all the Members 
>>>>>>>>>>>>> implicated
>>>>>>>>>>>>>
>>>>>>>>>>>>> foreach my $homology (@{$homologies}) {
>>>>>>>>>>>>> # You will find different kind of description
>>>>>>>>>>>>> # UBRH, MBRH, RHS, YoungParalogues
>>>>>>>>>>>>> # see ensembl-compara/docs/docs/schema_doc.html for more 
>>>>>>>>>>>>> details
>>>>>>>>>>>>>
>>>>>>>>>>>>> print $homology->description," ", $homology->subtype,"\n";
>>>>>>>>>>>>> # And if they are defined dN and dS related values
>>>>>>>>>>>>> print " dn ", $homology->dn,"\n";
>>>>>>>>>>>>> print " ds ", $homology->ds,"\n";
>>>>>>>>>>>>> print " dnds_ratio ", $homology->dnds_ratio,"\n";
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> my $homology = $homologies->[0];
>>>>>>>>>>>>> # take one of the homologies and look into it
>>>>>>>>>>>>>
>>>>>>>>>>>>> foreach my $member_attribute
>>>>>>>>>>>>> (@{$homology->get_all_Member_Attribute}) {
>>>>>>>>>>>>>
>>>>>>>>>>>>> # for each Member, you get information on the Member 
>>>>>>>>>>>>> specifically
>>>>>>>>>>>>> and in
>>>>>>>>>>>>> # relation to the homology relation via Attribute object
>>>>>>>>>>>>>
>>>>>>>>>>>>> my ($member, $attribute) = @{$member_attribute};
>>>>>>>>>>>>> print (join " ", map { $member->$_ } qw(stable_id 
>>>>>>>>>>>>> taxon_id))."\n";
>>>>>>>>>>>>> print (join " ", map { $attribute->$_ } qw(perc_id
>>>>>>>>>>>>> perc_pos perc_cov))."\n";
>>>>>>>>>>>>>
>>>>>>>>>>>>> }
>>>>>>>>>>>>> ====================================================
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> James Blackshaw
>>>>>>>>>>>>> PhD Student
>>>>>>>>>>>>> MRC Mitochondrial Biology Unit
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Dev mailing list Dev at ensembl.org
>>>>>>>>>>>>> List admin (including subscribe/unsubscribe):
>>>>>>>>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>>>>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> -- 
>>>>>>>>>>> Javier Herrero, PhD
>>>>>>>>>>> Ensembl Compara Project Leader
>>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>>> Wellcome Trust Genome Campus, Hinxton
>>>>>>>>>>> Cambridge - CB10 1SD - UK
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>> ---
>>>>> Andrew Yates                   Ensembl Core Software Project Leader
>>>>> EMBL-EBI                       Tel: +44-(0)1223-492538
>>>>> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
>>>>> Cambridge CB10 1SD, UK http://www.ensembl.org/
>>>>>
>>>>
>>>
>>>
>>>
>>
>

-- 
Javier Herrero, PhD
Ensembl Compara Project Leader
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK
