[ensembl-dev] Homology

James Blackshaw jab250 at mrc-mbu.cam.ac.uk
Thu Nov 17 15:28:20 GMT 2011


Hello Javier,
I'm unsure why, but it works now. It may be because I was trying to run 
it in an SSH session I started before I installed the Ensembl package. 
All I need to do now is work out how to put in a filter to only look for 
homologs in the two species I'm interested in.
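
A minimal sketch of such a filter, assuming the script from the slides as the starting point. It keeps only the homologues whose genome_db name appears in a lookup table. The %wanted hash and the two species names are placeholders that are not in the original script; the names must match whatever genome_db->name prints for the release in use.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $reg = "Bio::EnsEMBL::Registry";
$reg->load_registry_from_db(
    -host => "ensembldb.ensembl.org",
    -user => "anonymous",
);

# Assumption: placeholder species names. Run the unfiltered script once and
# copy the names exactly as genome_db->name prints them.
my %wanted = map { $_ => 1 } ("mus_musculus", "saccharomyces_cerevisiae");

my $ma = $reg->get_adaptor("Multi", "compara", "Member")
    or die "No Member adaptor: check the API installation and connection\n";
my $ha = $reg->get_adaptor("Multi", "compara", "Homology")
    or die "No Homology adaptor: check the API installation and connection\n";

my $member = $ma->fetch_by_source_stable_id("ENSEMBLGENE", "ENSG00000000971")
    or die "Gene stable ID not found in Compara\n";

foreach my $homology (@{ $ha->fetch_all_by_Member($member) }) {
    foreach my $pair (@{ $homology->get_all_Member_Attribute() }) {
        my ($other, $attribute) = @$pair;

        # Each homology also lists the query gene itself; skip it.
        next if $other->stable_id eq $member->stable_id;

        # Keep only members from the species of interest.
        next unless $wanted{ $other->genome_db->name };

        print join("\t", $homology->description,
                         $other->genome_db->name,
                         $other->stable_id), "\n";
    }
}
```

This filters on the client side after fetching every homology, which is simple but not the fastest route for long gene lists.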

Thank you for the advice,
-James

On 16/11/2011 19:33, Javier Herrero wrote:
> Hi James
>
> Sorry, I thought BioMart would have solved your problem. I guess it 
> would be easier if we stick to just one script rather than the two you 
> mentioned in an earlier email. I think I would opt for the first one 
> (slides) as the other one seems a little outdated (from the list).
>
> So, the script from the slides works out of the box (copy & paste from 
> your email) for me, using e64 API.
>
> It also works if I substitute the gene stable ID by 
> ENSDARG00000090113. However, if I use a non-existent gene stable ID or 
> anything different from the gene stable ID, I get the error you 
> reported in another email:
> /Can't call method "dbID" on an undefined value at 
> ~/ensembl-compara/modules/Bio/EnsEMBL/Compara/DBSQL/HomologyAdaptor.pm 
> line 37./
>
> Please, use that script and check that it works (with the original 
> gene identifier). Then change the identifier to get the data for your 
> gene of interest and check that it still works.
>
> If you get a failure with the original script, there is something 
> wrong with your setup (API installation, access to the database, etc). 
> If you get a failure using a different gene identifier, then you are 
> using an invalid gene identifier.
>
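
A minimal sketch of how the two failure modes described above could be told apart in a script, assuming the same public server and the calls used in the script from the slides. Reading the gene stable IDs from the command line is an assumption made for illustration.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $reg = "Bio::EnsEMBL::Registry";
$reg->load_registry_from_db(
    -host => "ensembldb.ensembl.org",
    -user => "anonymous",
);

my $ma = $reg->get_adaptor("Multi", "compara", "Member");
my $ha = $reg->get_adaptor("Multi", "compara", "Homology");

# Undefined adaptors point at the setup (API version, database access),
# not at the gene identifier.
die "Compara adaptors are undefined: check the API installation and database access\n"
    unless defined $ma && defined $ha;

# Assumption: gene stable IDs are passed as command-line arguments.
foreach my $stable_id (@ARGV) {
    my $member = $ma->fetch_by_source_stable_id("ENSEMBLGENE", $stable_id);

    # An undefined Member is what leads to the "dbID" error further down,
    # so report the identifier and move on instead.
    unless (defined $member) {
        warn "Skipping '$stable_id': not a gene stable ID known to Compara\n";
        next;
    }

    my $homologies = $ha->fetch_all_by_Member($member);
    print "$stable_id\t", scalar(@$homologies), " homologies\n";
}
```

Called with, for example, ENSG00000000971 and a made-up identifier, the first should give a count and the second a warning rather than a crash.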
> I hope this helps
>
> Javier
>
> On 16/11/11 18:13, James Blackshaw wrote:
>> Hello,
>>
>> Just a bit of a bump on this as I can't get the script to work and 
>> I've checked that I am using version 64 of Homologene. I could really 
>> do with a decent way to get homologs for a given species out of 
>> Ensembl for a list of genes. Sadly BioMart isn't much use as I can't 
>> get homologs from yeast or mouse to plants; they're on separate 
>> filters. Does anyone on this list have a similar script they might be 
>> willing to send me to see if I can get it to return anything?
>>
>> Regards,
>> James
>>>>> Hi,
>>>>> this still doesn't work, I get the error:
>>>>> "Can't call method "fetch_by_source_stable_id" on an undefined 
>>>>> value at homology_workshop_getAllHomologuesForGene.pl line 28."
>>>>>
>>>>> I get the following error off the script for comparing all 
>>>>> homologs between two species. A pity, as that is otherwise perfect 
>>>>> for what I want.
>>>>> "Can't call method "get_HomologyAdaptor" on an undefined value at 
>>>>> homology_getAllHomologuesBetween2Species.pl line 24."
>>>>>
>>>>> Is there anywhere I'd be able to check the version number of the 
>>>>> API I have?
>>>>>
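
One way to answer the version question, sketched under the assumption that the core Ensembl modules are on PERL5LIB: the Bio::EnsEMBL::ApiVersion module exports a software_version function. This reports the version of the core API checkout only; the ensembl-compara checkout could in principle be on a different branch.

```perl
use strict;
use warnings;

# Bio::EnsEMBL::ApiVersion ships with the core Ensembl API and exports
# software_version().
use Bio::EnsEMBL::ApiVersion;

print "Ensembl core API version: ", software_version(), "\n";
```

If the number printed is not 64, the checkout does not match the release the script is being tested against.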
>>>>> Regards,
>>>>> -James
>>>>>
>>>>> On 14/11/2011 12:02, Matthieu Muffato wrote:
>>>>>> Hi James
>>>>>>
>>>>>> I just hope there is no misunderstanding with the line numbers. 
>>>>>> Your script seems to work well, at least here, replacing the 
>>>>>> $homology_adaptor->fetch_all_by_Member($member) line by 
>>>>>> $homology_adaptor->fetch_all_by_Member($member->gene_member)
>>>>>>
>>>>>> I can reproduce the "dbID" error message when applying both changes. 
>>>>>> They are actually not compatible. Either you specify the gene ID 
>>>>>> in the first place and stick with 
>>>>>> $homology_adaptor->fetch_all_by_Member($member), or you keep the 
>>>>>> protein ID and add the "->gene_member".
>>>>>>
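
The two alternatives above can be sketched side by side as follows. Both identifiers are taken from this thread; whether they belong to the same gene is not confirmed here, so the two counts need not agree. The variable names are illustrative.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $reg = "Bio::EnsEMBL::Registry";
$reg->load_registry_from_db(
    -host => "ensembldb.ensembl.org",
    -user => "anonymous",
);

my $member_adaptor   = $reg->get_adaptor("Multi", "compara", "Member");
my $homology_adaptor = $reg->get_adaptor("Multi", "compara", "Homology");
die "Could not get the Compara adaptors\n"
    unless defined $member_adaptor && defined $homology_adaptor;

# Option 1: fetch the gene Member directly and pass it as is.
my $gene = $member_adaptor->fetch_by_source_stable_id(
    "ENSEMBLGENE", "ENSDARG00000090113");

# Option 2: fetch the peptide Member, then step up to its gene Member.
my $peptide = $member_adaptor->fetch_by_source_stable_id(
    "ENSEMBLPEP", "ENSDARP00000103634");
my $gene_from_peptide = defined($peptide) ? $peptide->gene_member : undef;

my %query = (
    "gene ID"    => $gene,
    "peptide ID" => $gene_from_peptide,
);

foreach my $label (sort keys %query) {
    my $member = $query{$label};
    unless (defined $member) {
        warn "No Member found via $label\n";
        next;
    }
    # In both cases a gene-level Member is passed to the HomologyAdaptor.
    my $homologies = $homology_adaptor->fetch_all_by_Member($member);
    print "$label: ", $member->stable_id, " has ",
        scalar(@$homologies), " homologies\n";
}
```

Mixing the two, that is, fetching with "ENSEMBLGENE" and then calling ->gene_member as well, is the combination reported above as producing the "dbID" error.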
>>>>>> There is a script in the scripts/examples directory of the 
>>>>>> ensembl-compara repository, named 
>>>>>> homology_workshop_getAllHomologuesForGene.pl, that does the job 
>>>>>> if you know the gene name.
>>>>>>
>>>>>> Regards,
>>>>>> Matthieu
>>>>>>
>>>>>> On 14/11/11 11:41, James Blackshaw wrote:
>>>>>>> Hi Matthieu,
>>>>>>> Changing Line 16 doesn't help, I get the same error reported.
>>>>>>> Changing line 25 gives me a new error though.
>>>>>>> Can't call method "dbID" on an undefined value at
>>>>>>> /usr/mbu/software/ensembl/ensembl-compara/modules/Bio/EnsEMBL/Compara/DBSQL/HomologyAdaptor.pm
>>>>>>> line 37.
>>>>>>>
>>>>>>> What I'm trying to do is get the homologs for a list of proteins or
>>>>>>> associated genes; either works. I'm surprised there's not already a
>>>>>>> script about for it, but I can't find one on the mailing list.
>>>>>>>
>>>>>>> -James
>>>>>>>
>>>>>>>
>>>>>>> On 14/11/2011 11:25, Matthieu Muffato wrote:
>>>>>>>> Hi James
>>>>>>>>
>>>>>>>> In Compara, the homologies are stored at the gene level. Because
>>>>>>>> your Member object represents a peptide, the homology list that
>>>>>>>> you retrieve is empty.
>>>>>>>>
>>>>>>>> You can either change the line 25 to
>>>>>>>> my $homologies =
>>>>>>>> $homology_adaptor->fetch_all_by_Member($member->gene_member);
>>>>>>>>
>>>>>>>> or change the line 16 to
>>>>>>>> my $member = 
>>>>>>>> $member_adaptor->fetch_by_source_stable_id("ENSEMBLGENE",
>>>>>>>> ****) with an Ensembl gene ID if you know it
>>>>>>>>
>>>>>>>> to have all the homologies of your favourite gene
>>>>>>>>
>>>>>>>> Hope this helps,
>>>>>>>> Matthieu
>>>>>>>>
>>>>>>>> On 14/11/11 11:15, James Blackshaw wrote:
>>>>>>>>> Hi,
>>>>>>>>> I am using the code from the API installation page and that 
>>>>>>>>> page talks
>>>>>>>>> about version 64.
>>>>>>>>> http://www.ensembl.org/info/docs/api/api_installation.html
>>>>>>>>>
>>>>>>>>> This is my output under verbose, looks like it is connecting.
>>>>>>>>> Odd number of elements in hash assignment at
>>>>>>>>> /usr/mbu/software/ensembl/ensembl/modules/Bio/EnsEMBL/Utils/Argument.pm
>>>>>>>>> line 148.
>>>>>>>>> Transcript:ENSDART00000112153 Gene:ENSDARG00000090113 Chr:19
>>>>>>>>> Start:33519855 End:33522332
>>>>>>>>> Can't call method "get_all_Member_Attribute" on an undefined 
>>>>>>>>> value at
>>>>>>>>> sandbox3.pl line 47.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> James
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/11/2011 20:20, Javier Herrero wrote:
>>>>>>>>>> Hi James
>>>>>>>>>>
>>>>>>>>>> This error is typical when you fail to connect to the database.
>>>>>>>>>>
>>>>>>>>>> I suspect you are using the HEAD code instead of the branch 64.
>>>>>>>>>> The HEAD code is already configured for connecting to the
>>>>>>>>>> forthcoming database. Please switch your code to the 64 branch
>>>>>>>>>> and try again:
>>>>>>>>>>
>>>>>>>>>> cvs up -r branch-ensembl-64
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> Javier
>>>>>>>>>>
>>>>>>>>>> On 11/11/11 19:29, Jan Vogel wrote:
>>>>>>>>>>> Hi James,
>>>>>>>>>>>
>>>>>>>>>>> the first script from the powerpoint works for me w/o problems
>>>>>>>>>>> or modifications. I've tried schema 62 and schema 64 APIs.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The second one ('from the list') had some minor problems. It
>>>>>>>>>>> works like this:
>>>>>>>>>>>
>>>>>>>>>>> my $member =
>>>>>>>>>>> $member_adaptor->fetch_by_source_stable_id('ENSEMBLGENE','ENSG00000004059');
>>>>>>>>>>>
>>>>>>>>>>> or
>>>>>>>>>>>
>>>>>>>>>>> my $member =
>>>>>>>>>>> $member_adaptor->fetch_by_source_stable_id(undef,'ENSG00000004059');
>>>>>>>>>>>
>>>>>>>>>>> I did not try the zfish protein.
>>>>>>>>>>>
>>>>>>>>>>> Hth,
>>>>>>>>>>> Jan
>>>>>>>>>>>
>>>>>>>>>>> On Nov 11, 2011, at 9:50 AM, James Blackshaw wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> I've been trying to put together some scripts for finding the
>>>>>>>>>>>> homologies for some lists of genes I'm interested in, but I keep
>>>>>>>>>>>> getting errors with the "fetch" methods.
>>>>>>>>>>>>
>>>>>>>>>>>> "Can't call method "fetch_by_source_stable_id" on an undefined
>>>>>>>>>>>> value at sandbox3.pl line 15."
>>>>>>>>>>>>
>>>>>>>>>>>> I've used one script taken from a presentation by Stephen
>>>>>>>>>>>> Fitzgerald at Edinburgh, and another from this mailing list.
>>>>>>>>>>>> I'm including both.
>>>>>>>>>>>>
>>>>>>>>>>>>  From the powerpoint:
>>>>>>>>>>>> use strict;
>>>>>>>>>>>> use Bio::EnsEMBL::Registry;
>>>>>>>>>>>> my $reg = "Bio::EnsEMBL::Registry";
>>>>>>>>>>>>
>>>>>>>>>>>> $reg->load_registry_from_db(
>>>>>>>>>>>>     -host => "ensembldb.ensembl.org",
>>>>>>>>>>>>     -user => "anonymous");
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> my $ma = $reg->get_adaptor(
>>>>>>>>>>>>     "Multi", "compara", "Member");
>>>>>>>>>>>> my $member = $ma->fetch_by_source_stable_id(
>>>>>>>>>>>>     "ENSEMBLGENE", "ENSG00000000971");
>>>>>>>>>>>>
>>>>>>>>>>>> my $homology_adaptor = $reg->get_adaptor(
>>>>>>>>>>>>     "Multi", "compara", "Homology");
>>>>>>>>>>>>
>>>>>>>>>>>> my $homologies = $homology_adaptor->
>>>>>>>>>>>>     fetch_all_by_Member($member);
>>>>>>>>>>>>
>>>>>>>>>>>> foreach my $this_homology (@$homologies) {
>>>>>>>>>>>>     print $this_homology->description, "\n";
>>>>>>>>>>>>     my $member_attributes = $this_homology->
>>>>>>>>>>>>         get_all_Member_Attribute();
>>>>>>>>>>>>     foreach my $this_mem_attr (@$member_attributes) {
>>>>>>>>>>>>         my ($this_member, $this_attribute) =
>>>>>>>>>>>>             @$this_mem_attr;
>>>>>>>>>>>>         print $this_member->genome_db->name, " ",
>>>>>>>>>>>>             $this_member->source_name, " ",
>>>>>>>>>>>>             $this_member->stable_id, "\n";
>>>>>>>>>>>>     }
>>>>>>>>>>>>     print "\n";
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> ==========================================
>>>>>>>>>>>>
>>>>>>>>>>>>  From the list:
>>>>>>>>>>>> use Bio::EnsEMBL::Registry;
>>>>>>>>>>>> Bio::EnsEMBL::Registry->load_registry_from_db(
>>>>>>>>>>>> -host =>  'ensembldb.ensembl.org',
>>>>>>>>>>>> -user =>  'anonymous',
>>>>>>>>>>>> -port =>  5306);
>>>>>>>>>>>> my $member_adaptor = Bio::EnsEMBL::Registry->get_adaptor(
>>>>>>>>>>>> 'Multi','compara','Member');
>>>>>>>>>>>>
>>>>>>>>>>>> # fetch a Member
>>>>>>>>>>>> # get the MemberAdaptor
>>>>>>>>>>>> my $member_adaptor =
>>>>>>>>>>>> Bio::EnsEMBL::Registry->get_adaptor('Multi','compara','Member');
>>>>>>>>>>>>
>>>>>>>>>>>> # fetch a Member
>>>>>>>>>>>> my $member =
>>>>>>>>>>>> $member_adaptor->fetch_by_source_stable_id('ENSEMBLPROTEIN','ENSG00000004059');
>>>>>>>>>>>>
>>>>>>>>>>>> my $member =
>>>>>>>>>>>> $member_adaptor->fetch_by_source_stable_id('ENSEMBLPEP','ENSDARP00000103634');
>>>>>>>>>>>>
>>>>>>>>>>>> # print out some information about the Member
>>>>>>>>>>>> print $member->description, "\n";
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> my $homology_adaptor = 
>>>>>>>>>>>> Bio::EnsEMBL::Registry->get_adaptor('Multi',
>>>>>>>>>>>> 'compara', 'Homology');
>>>>>>>>>>>> my $homologies = 
>>>>>>>>>>>> $homology_adaptor->fetch_all_by_Member($member);
>>>>>>>>>>>>
>>>>>>>>>>>> # That will return a reference to an array with all homologies
>>>>>>>>>>>> # (orthologues in other species and paralogues in the same one).
>>>>>>>>>>>> # Then for each homology, you can get all the Members implicated.
>>>>>>>>>>>>
>>>>>>>>>>>> foreach my $homology (@{$homologies}) {
>>>>>>>>>>>>     # You will find different kinds of description:
>>>>>>>>>>>>     # UBRH, MBRH, RHS, YoungParalogues
>>>>>>>>>>>>     # see ensembl-compara/docs/docs/schema_doc.html for more details
>>>>>>>>>>>>
>>>>>>>>>>>>     print $homology->description, " ", $homology->subtype, "\n";
>>>>>>>>>>>>     # And, if they are defined, the dN and dS related values
>>>>>>>>>>>>     print " dn ", $homology->dn, "\n";
>>>>>>>>>>>>     print " ds ", $homology->ds, "\n";
>>>>>>>>>>>>     print " dnds_ratio ", $homology->dnds_ratio, "\n";
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> my $homology = $homologies->[0];
>>>>>>>>>>>> # take one of the homologies and look into it
>>>>>>>>>>>>
>>>>>>>>>>>> foreach my $member_attribute (@{$homology->get_all_Member_Attribute}) {
>>>>>>>>>>>>     # for each Member, you get information on the Member specifically
>>>>>>>>>>>>     # and in relation to the homology relation via the Attribute object
>>>>>>>>>>>>
>>>>>>>>>>>>     my ($member, $attribute) = @{$member_attribute};
>>>>>>>>>>>>     # join(...) is called explicitly so that "\n" is printed too
>>>>>>>>>>>>     print join(" ", map { $member->$_ } qw(stable_id taxon_id)), "\n";
>>>>>>>>>>>>     print join(" ", map { $attribute->$_ } qw(perc_id perc_pos perc_cov)), "\n";
>>>>>>>>>>>> }
>>>>>>>>>>>> ====================================================
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> James Blackshaw
>>>>>>>>>>>> PhD Student
>>>>>>>>>>>> MRC Mitochondrial Biology Unit
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Dev mailing list Dev at ensembl.org
>>>>>>>>>>>> List admin (including subscribe/unsubscribe):
>>>>>>>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>>>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>>>> Javier Herrero, PhD
>>>>>>>>>> Ensembl Compara Project Leader
>>>>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>>>>> Wellcome Trust Genome Campus, Hinxton
>>>>>>>>>> Cambridge - CB10 1SD - UK
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>> ---
>>>> Andrew Yates                   Ensembl Core Software Project Leader
>>>> EMBL-EBI                       Tel: +44-(0)1223-492538
>>>> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
>>>> Cambridge CB10 1SD, UK http://www.ensembl.org/
>>>>
>>>
>>
>>
>>
>
> -- 
> Javier Herrero, PhD
> Ensembl Compara Project Leader
> European Bioinformatics Institute (EMBL-EBI)
> Wellcome Trust Genome Campus, Hinxton
> Cambridge - CB10 1SD - UK
>
>
