[ensembl-dev] Gene Symbol

Andy Yates ayates at ebi.ac.uk
Mon Feb 13 21:05:25 GMT 2012


Hi Nick,

You need to call display_xref() on each gene object, which returns an instance of DBEntry; its display_id() gives you the display label. So using the original method you would want to do something like:

my $name = 'CLK2';
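# Keep only genes whose display label matches exactly; skip any gene without a display xref.
my ($gene) = grep { $_->display_xref() && $_->display_xref()->display_id() eq $name }
             @{$gene_adaptor->fetch_all_by_external_name($name, 'HGNC')};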

Though, as you said before, using fetch_by_display_label() would achieve identical results.
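
To illustrate why the exact-match check on the display label removes the synonym hits, here is a minimal sketch that runs without an Ensembl database. StubGene and StubXref are hypothetical stand-ins, not part of the Ensembl API; they implement only the display_xref() and display_id() calls used in the filter, and the labels are taken from the SF3B1 case discussed further down the thread. The helper name filter_by_display_label is likewise just an assumption for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a DBEntry: only display_id() is implemented.
{
    package StubXref;
    sub new { my ($class, $label) = @_; return bless { label => $label }, $class; }
    sub display_id { my ($self) = @_; return $self->{label}; }
}

# Hypothetical stand-in for a Gene: only display_xref() is implemented.
# Passing undef as the label models a gene with no display xref.
{
    package StubGene;
    sub new {
        my ($class, $label) = @_;
        my $xref = defined $label ? StubXref->new($label) : undef;
        return bless { xref => $xref }, $class;
    }
    sub display_xref { my ($self) = @_; return $self->{xref}; }
}

package main;

# Return the genes whose display label equals $name exactly. Perl's eq is
# case-sensitive, so a hit that only matched through a differently-cased
# synonym is dropped. Genes without a display xref are skipped.
sub filter_by_display_label {
    my ($name, $genes) = @_;
    return grep {
        my $xref = $_->display_xref();
        defined $xref && $xref->display_id() eq $name;
    } @{$genes};
}

# Pretend the synonym-aware lookup returned these three candidates.
my @candidates = (
    StubGene->new('SF3B1'),
    StubGene->new('SF3B2'),
    StubGene->new(undef),
);

my @kept = filter_by_display_label('SF3B1', \@candidates);
print scalar(@kept), " gene(s) kept: ",
    join(', ', map { $_->display_xref()->display_id() } @kept), "\n";
```

With real Gene objects the same grep block should apply unchanged to the array reference returned by fetch_all_by_external_name().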

Andy

Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/

On 13 Feb 2012, at 20:16, Nick Fankhauser wrote:

> Hi!
> 
> yes, you're right, it works! I just realized that many of my
> gene-symbols were not actually the official ones, but aliases. So I
> guess I have to translate those by hand, then it should work like this.
> 
> Or how could I get the display_label from the gene-adaptor, then it
> would also be possible using the other method, maybe?
> 
> Yours,
> Nick
> 
> On 13/02/12 21:04, Andy Yates wrote:
>> Hi Nick,
>> 
>> Apologies, that method had slipped my mind, and you are right: you can use fetch_by_display_label() instead.
>> 
>> Andy
>> 
>> On 13 Feb 2012, at 19:56, Nick Fankhauser wrote:
>> 
>>> Hi!
>>> 
>>> Thanks!
>>> But how can I get the display_label from a gene adaptor?
>>> 
>>> And couldn't I just have used fetch_by_display_label instead of
>>> fetch_all_by_external_name in the first place to retrieve the gene by
>>> gene-symbol?
>>> 
>>> Yours,
>>> Nick
>>> 
>>> On 13/02/12 18:05, Andy Yates wrote:
>>>> Hi Nick,
>>>> 
>>>> Again it's an issue with synonyms. If we take the case of SF3B1 we get two hits back:
>>>> 
>>>> http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000115524;r=2:198256698-198299815
>>>> 
>>>> http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000087365;r=11:65818200-65836779;t=ENST00000528302
>>>> 
>>>> The first is the intended record. The second is SF3B2, which has the synonym SF3b1; it is returned because comparisons in our MySQL tables are case-insensitive. If you add a check on the display label, that should remove the remaining stragglers.
>>>> 
>>>> Andy
>>>> 
>>>> On 13 Feb 2012, at 16:26, Nick Fankhauser wrote:
>>>> 
>>>>> Yes, thanks a lot! Like this it produces far fewer false hits.
>>>>> Especially when I combine it with rejecting all MHC chromosomes.
>>>>> 
>>>>> But there's still for example SF3B1 and RAGE, for which I get results
>>>>> from two different chromosomes for some reason. Do you know why this can
>>>>> still be the case?
>>>>> 
>>>>> Nick
>>>>> 
>>>>> 
>>>>> On 13/02/12 17:00, Andy Yates wrote:
>>>>>> Hi Nick,
>>>>>> 
>>>>>> My guess is that you're hitting an issue with external synonyms. The method you are using will consult all xrefs linked to a gene (along with transcripts and translations) as well as consulting the external_synonym table. In the case of CLK2 we have the following external synonyms linked to the term CLK2.
>>>>>> 
>>>>>> synonym  db_name          dbprimary_acc       display_label
>>>>>> clk2     Vega_transcript  OTTHUMT00000364143  OTTHUMT00000364143
>>>>>> clk2     Vega_transcript  OTTHUMT00000365664  RP11-531A21.3-001
>>>>>> clk2     Vega_transcript  OTTHUMT00000272912  OTTHUMT00000272912
>>>>>> clk2     OTTG             OTTHUMG00000150164  OTTHUMG00000150164
>>>>>> CLK2     EntrezGene       9894                TELO2
>>>>>> clk2     HGNC             2069                CLK2
>>>>>> 
>>>>>> If you restrict your query to the external DB of the source, the number of hits will drop massively, e.g.
>>>>>> 
>>>>>> my $genes = $gene_adaptor->fetch_all_by_external_name('CLK2', 'HGNC');
>>>>>> 
>>>>>> All the best
>>>>>> 
>>>>>> Andy
>>>>>> 
>>>>>> On 13 Feb 2012, at 15:22, Nick Fankhauser wrote:
>>>>>> 
>>>>>>> Hi!
>>>>>>> 
>>>>>>> I'm trying to retrieve the chromosomal position for a list of
>>>>>>> gene-symbols. They are all official gene-symbols.
>>>>>>> 
>>>>>>> Using a loop like this
>>>>>>> 
>>>>>>> foreach my $gene (@{$gene_adaptor->fetch_all_by_external_name($gene_symbol)}) {
>>>>>>> 
>>>>>>> I get one correct hit for some genes (e.g. USP8), but for others like
>>>>>>> CLK2, I get multiple results and have no idea how to select the correct
>>>>>>> one. Is there a way to get the position for just the official gene symbol?
>>>>>>> 
>>>>>>> Thanks!
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> Dev mailing list    Dev at ensembl.org
>>>>>>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>> Ensembl Blog: http://www.ensembl.info/