[ensembl-dev] Wrong gene description

Andy Yates ayates at ebi.ac.uk
Tue Oct 15 10:17:09 BST 2013


Hi Sébastien,

Ensembl's mapping to Xenbase comes from the following file:

ftp://ftp.xenbase.org/pub/GenePageReports/GenePageEnsemblModelMapping.txt

If you look for the record associated with ENSXETG00000008325 you can see the full gene name (which we are using as a description):

Putative ortholog of galactosylgalactosylxylosylprotein 3-beta-glucuronosyltransferase 1 (EC 2.4.1.135) (Beta-1,3-glucuronyltransferase 1) (Glucuronosyltransferase-P) (GlcAT-P) (UDP-GlcUA:glycoprotein beta- 1,3-glucuronyltransferase) (GlcUAT-P). [Source:Uniprot/SWISSPROT;Acc:, 1 of 1
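
For anyone who wants to check a record themselves, here is a minimal sketch of the lookup. It assumes the mapping file is tab-separated and that the Ensembl gene ID sits in a column of its own; neither has been checked against the real file, and the function name find_records and the sample rows are made up for illustration.

```python
import csv


def find_records(lines, ensembl_id):
    """Return every row (as a list of columns) that has ensembl_id as a column.

    `lines` is any iterable of text lines, e.g. an open file handle.
    The tab-separated layout is an assumption, not a documented format.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if ensembl_id in row]


# Hypothetical sample rows; the real column order may differ.
sample = [
    "XB-GENE-961613\tunnamed\tENSXETG00000008325\n",
    "XB-GENE-000001\tfoo\tENSXETG00000000001\n",
]
for row in find_records(sample, "ENSXETG00000008325"):
    print(row)
```

With a downloaded copy of the file, the same function can be given the open file handle instead of the sample list.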

The extra "[Source:" section seen on our website comes from Xenbase and is not added by our pipelines (we do add the Source:Jamboree part though). However, this micro-format does scream Ensembl. It suggests that Xenbase imported annotation from us and has not fully stripped this section. I have submitted an issue to Xenbase to see if they can clean up the record.
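
Until the record is fixed upstream, a client-side workaround is possible. The following is only a sketch of one way to drop an unterminated "[Source:...;Acc:" fragment while keeping a complete "[Source:...;Acc:...]" tag; it is not what the Ensembl or Xenbase pipelines do, and the function name and regular expression are assumptions of mine.

```python
import re

# Matches an optional "." and whitespace, then a "[Source:...;Acc:" fragment
# that is NOT closed by "]" before the next bracket or the end of the string.
_DANGLING_SOURCE = re.compile(r"\.?\s*\[Source:[^\[\]]*?;Acc:(?![^\[\]]*\])")


def strip_dangling_source(description):
    """Remove unterminated [Source:...;Acc: fragments from a description."""
    return _DANGLING_SOURCE.sub("", description)


raw = (
    "(GlcUAT-P). [Source:Uniprot/SWISSPROT;Acc:, 1 of 1 "
    "[Source:Jamboree;Acc:XB-GENE-961613]"
)
print(strip_dangling_source(raw))
```

The properly closed Source:Jamboree tag is left untouched because the negative lookahead finds its closing bracket.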

Andy

------------
Andrew Yates - Ensembl Core Software Project Leader
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
Tel: +44-(0)1223-492538
Fax: +44-(0)1223-494468
http://www.ensembl.org/

On 15 Oct 2013, at 08:50, Sébastien Moretti <sebastien.moretti at unil.ch> wrote:

> Hi Magali,
> 
> 
>> Hi Sebastien,
>> 
>> Xenbase uses two types of information to describe their gene models.
>> 
>> The first one is the gene symbol, which corresponds to what we in
>> Ensembl would call a display name.
>> The second one is the gene name, which we import as a gene description.
>> 
>> For the entry you are referring to, the gene symbol is 'unnamed', as
>> Xenbase have not manually annotated that gene.
>> This is what the 'Source:UniProt/SWISSPROT' part of the description
>> refers to.
>> As this entry has not yet been fully annotated, the description has been
>> imported from Uniprot.
>> 
>> We import our xenopus annotations directly from Xenbase, so rely on
>> their knowledge in the matter.
> 
> According to Xenbase, the "1 of 1" in [Source:UniProt/SWISSPROT;Acc:, 1 of 1 [Source:Jamboree;Acc:XB-GENE-961613] should be part of the ensembl description core, not in the source part, at the end.
> And ". [Source:UniProt/SWISSPROT;Acc:" is useless.
> 
>> As for the space, I believe there is none between 'beta' and '1,3', but
>> the rendering makes it look like there is.
>> Looking the ID up on our REST service will show you the exact
>> description, without additional rendering.
>> http://beta.rest.ensembl.org/xrefs/id/ENSXETG00000008325?content-type=application/json
>> 
>> 
>> Hope that helps,
>> Magali
>> 
>> On 14/10/13 14:54, Sébastien Moretti wrote:
>>> Hi
>>> 
>>> it looks like there is a problem with the gene description of
>>> ENSXETG00000008325, at least since release 68.
>>> 
>>> The source at the end of the description is
>>>  [Source:UniProt/SWISSPROT;Acc:, 1 of 1
>>> [Source:Jamboree;Acc:XB-GENE-961613]
>>> and should be
>>>  [Source:Jamboree;Acc:XB-GENE-961613]
>>> "1 of 1" is part of the gene name according to Xenbase.
>>> 
>>> Also, "beta-" & "1,3-glucuronyltransferase" should not be separated by
>>> a space I think.
>>> 
>>> 
>>> 
>>> Putative ortholog of galactosylgalactosylxylosylprotein
>>> 3-beta-glucuronosyltransferase 1 (EC 2.4.1.135)
>>> (Beta-1,3-glucuronyltransferase 1) (Glucuronosyltransferase-P)
>>> (GlcAT-P) (UDP-GlcUA:glycoprotein beta-1,3-glucuronyltransferase)
>>> (GlcUAT-P), 1 of 1 [Source:Jamboree;Acc:XB-GENE-961613]
>>> 
>>> Regards
> 
> -- 
> Sébastien Moretti
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221/4079
> http://selectome.unil.ch/ http://bgee.unil.ch/
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/




