[ensembl-dev] Gene Symbol

Nick Fankhauser lists at nyk.ch
Mon Feb 13 19:56:26 GMT 2012


Hi!

Thanks!
But how can I get the display_label from a gene adaptor?

And couldn't I just have used fetch_by_display_label instead of
fetch_all_by_external_name in the first place to retrieve the gene by
gene-symbol?

Yours,
Nick

On 13/02/12 18:05, Andy Yates wrote:
> Hi Nick,
> 
> Again it's an issue with synonyms. If we take the case of SF3B1 we get two hits back:
> 
> http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000115524;r=2:198256698-198299815
> 
> http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000087365;r=11:65818200-65836779;t=ENST00000528302
> 
> The first is the intended record. The second is SF3B2, which has the synonym SF3b1; it is returned because our MySQL tables are case-insensitive. If you add a check on the display label, that should remove the remaining stragglers.
> 
> Andy
> 
> Andrew Yates                   Ensembl Core Software Project Leader
> EMBL-EBI                       Tel: +44-(0)1223-492538
> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK         http://www.ensembl.org/
> 
> On 13 Feb 2012, at 16:26, Nick Fankhauser wrote:
> 
>> Yes, thanks a lot! This way it produces far fewer false hits,
>> especially when I combine it with rejecting all MHC chromosomes.
>>
>> But there are still cases, for example SF3B1 and RAGE, where I get
>> results from two different chromosomes. Do you know why this can
>> still happen?
>>
>> Nick
>>
>>
>> On 13/02/12 17:00, Andy Yates wrote:
>>> Hi Nick,
>>>
>>> My guess is that you're hitting an issue with external synonyms. The method you are using will consult all xrefs linked to a gene (along with transcripts and translations) as well as consulting the external_synonym table. In the case of CLK2 we have the following external synonyms linked to the term CLK2.
>>>
>>> synonym  db_name          dbprimary_acc       display_label
>>> clk2     Vega_transcript  OTTHUMT00000364143  OTTHUMT00000364143
>>> clk2     Vega_transcript  OTTHUMT00000365664  RP11-531A21.3-001
>>> clk2     Vega_transcript  OTTHUMT00000272912  OTTHUMT00000272912
>>> clk2     OTTG             OTTHUMG00000150164  OTTHUMG00000150164
>>> CLK2     EntrezGene       9894                TELO2
>>> clk2     HGNC             2069                CLK2
>>>
>>> If you change your query to limit by the external DB of the source, the number of hits will drop massively, e.g.
>>>
>>> my $genes = $gene_adaptor->fetch_all_by_external_name('CLK2', 'HGNC');
>>>
>>> All the best
>>>
>>> Andy
>>>
>>> On 13 Feb 2012, at 15:22, Nick Fankhauser wrote:
>>>
>>>> Hi!
>>>>
>>>> I'm trying to retrieve the chromosomal position for a list of
>>>> gene-symbols. They are all official gene-symbols.
>>>>
>>>> Using a loop like this
>>>>
>>>>   foreach my $gene (@{$gene_adaptor->fetch_all_by_external_name($gene_symbol)}) {
>>>>
>>>> I get one correct hit for some genes (e.g. USP8), but for others like
>>>> CLK2, I get multiple results and have no idea how to select the correct
>>>> one. Is there a way to get the position for just the official gene symbol?
>>>>
>>>> Thanks!
>>>>
>>>> _______________________________________________
>>>> Dev mailing list    Dev at ensembl.org
>>>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
>>>
>>>
> 
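The display-label check suggested in the thread could look roughly like the sketch below. It assumes that a gene object exposes its display label through an external_name method, as Bio::EnsEMBL::Gene does in the Ensembl Perl API. The helper name filter_by_display_label and the MockGene class are hypothetical; they exist only so the example runs without a database connection. The stable IDs and chromosomes are the two SF3B1 hits quoted above.

```perl
use strict;
use warnings;

# Keep only genes whose display label equals the requested symbol.
# Perl's 'eq' is case-sensitive, unlike the MySQL lookup behind
# fetch_all_by_external_name, so a gene that only matched through a
# synonym such as SF3b1 is dropped here.
sub filter_by_display_label {
    my ($genes, $symbol) = @_;
    return [
        grep { defined($_->external_name) && $_->external_name eq $symbol }
        @$genes
    ];
}

# Hypothetical stand-in for Bio::EnsEMBL::Gene (assumption, not the real
# class). With the real API the list would instead come from
#   $gene_adaptor->fetch_all_by_external_name($gene_symbol, 'HGNC');
{
    package MockGene;
    sub new             { my ($class, %args) = @_; return bless {%args}, $class }
    sub external_name   { return $_[0]{external_name} }
    sub stable_id       { return $_[0]{stable_id} }
    sub seq_region_name { return $_[0]{seq_region_name} }
}

# The two hits reported for SF3B1 in this thread.
my @hits = (
    MockGene->new(
        stable_id       => 'ENSG00000115524',
        external_name   => 'SF3B1',
        seq_region_name => '2',
    ),
    MockGene->new(
        stable_id       => 'ENSG00000087365',
        external_name   => 'SF3B2',
        seq_region_name => '11',
    ),
);

my $kept = filter_by_display_label(\@hits, 'SF3B1');
printf "%s\t%s\tchr%s\n", $_->external_name, $_->stable_id, $_->seq_region_name
    for @$kept;
```

Inside the original loop the same idea reduces to skipping any $gene for which $gene->external_name is not equal to $gene_symbol. Combining this with the 'HGNC' restriction on fetch_all_by_external_name should leave only the gene whose own label is the official symbol.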



