[ensembl-dev] Choosing which protein to do ortholog compare

Michael Schuster michaels at ebi.ac.uk
Fri Jul 29 16:02:22 BST 2011


If I may add here, this comparison isn't handled at the level of
Ensembl Compara, but falls within the remit of the CCDS collaboration,
of which UCSC, NCBI, Havana, Ensembl and also UniProt are members.

In this particular case, good biological evidence for the long isoform
of Adamts9 in mouse is somewhat lacking. A full-length cDNA does not
seem to be available, and the EST coverage towards the 5' end of the gene
(purple track on the Location-View page linked below) is somewhat
underwhelming. As far as I can see, UniProt does not seem to have
a manually annotated (reviewed) protein record for mouse Adamts9 either.

http://www.ensembl.org/Mus_musculus/Location/View?db=core;g=ENSMUSG00000030022;r=6:92707901-92896048;t=ENSMUST00000113438;contigviewbottom=dna_align_otherfeatures_mouse_est=unlimited

The transcript Adamts9-001 (ENSMUST00000113438) is based on manual
curation by Havana (OTTMUST00000063612), whose 5' end is, in turn,
based on human cDNA AF488803.

http://www.ensembl.org/Mus_musculus/Transcript/Summary?db=core;g=ENSMUSG00000030022;r=6:92707901-92896048;t=ENSMUST00000113438

http://vega.sanger.ac.uk/Mus_musculus/Transcript/Summary?g=OTTMUSG00000025821;r=6:92722693-92893486;t=OTTMUST00000063612

http://vega.sanger.ac.uk/Mus_musculus/Transcript/SupportingEvidence?g=OTTMUSG00000025821;r=6:92722693-92893486;t=OTTMUST00000063612

http://www.ebi.ac.uk/ena/data/view/AF488803.1

The upshot is that RefSeq currently has no representation for this
long isoform and, therefore, no CCDS representative is available.
RefSeq NM_175314 covers the short isoform to the extent that ESTs support
it. It is associated with CCDS51855 and with transcript Adamts9-201
(ENSMUST00000167391), which has been selected for the gene tree analysis.
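
To make the selection rule that Javier describes below concrete (prefer a
CCDS entry, otherwise take the longest protein), here is a minimal sketch
in Python. It is an illustration only, not the actual Compara code: the
Translation class, its field names and pick_representative are hypothetical,
and pairing the 1931 aa product with Adamts9-001 and the 1350 aa product
with Adamts9-201 is my reading of this thread rather than something checked
against the database.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Translation:
    """Hypothetical record for one protein product of a gene."""
    transcript_id: str
    length_aa: int
    ccds_id: Optional[str] = None  # None when the product has no CCDS entry


def pick_representative(translations: Sequence[Translation]) -> Translation:
    """Prefer products with a CCDS entry; break ties by protein length."""
    if not translations:
        raise ValueError("gene has no protein products")
    # Tuples compare element by element, so CCDS membership outranks length.
    return max(translations, key=lambda t: (t.ccds_id is not None, t.length_aa))


# Mouse Adamts9 as discussed in this thread (lengths from Dick's message;
# the transcript-to-length pairing is an assumption).
adamts9 = [
    Translation("ENSMUST00000113438", 1931),
    Translation("ENSMUST00000167391", 1350, "CCDS51855"),
]

chosen = pick_representative(adamts9)
print(chosen.transcript_id, chosen.length_aa)
```

With these inputs the shorter, CCDS-backed product is picked even though a
longer one exists, which matches what the Orthologues page shows. Without
any CCDS entry the rule falls back to the longest product.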

Best regards,
Michael Schuster


On 28 Jul 2011, at 15:43, Southan, Christopher wrote:

> Is there any step in the Compara system that can check the
> UniProt-SwissProt entry (manual curation of the longest ORF) against the
> CCDS and/or what you select for the orthologue assignments?
>
> Yours, Chris
>
>
> -----Original Message-----
> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On Behalf Of Javier Herrero
> Sent: 28 July 2011 16:37
> To: dev at ensembl.org; Richard Chirko
> Subject: Re: [ensembl-dev] Choosing which protein to do ortholog compare
>
> Dear Dick
>
> We generally use the longest protein, but we favour the CCDS entries. In
> the case of the Adamts9 gene, the longest protein product is not part of
> the CCDS while the second longest is. In those cases, we choose the CCDS
> entry.
>
> I hope this helps
>
> Javier
>
> On Thursday 28 Jul 2011 15:01:23 Richard Chirko wrote:
>> Mouse gene Adamts9 shows 2 protein products. One is 1931 AA and the other
>> is 1350 AA. When I click on the Orthologues link, all the comparisons are
>> to the 1350 product. Yet virtually all the proteins they are compared to
>> are in the 1800 to 1950 range. How do you choose which protein product to
>> use for the comparison and is there a way to get (from your site) the
>> orthologues for the 1931 AA product?
>> Many thanks for your help in this.
>>
>> Dick
>
> -- 
> Javier Herrero, PhD
> Ensembl Compara Project Leader
> European Bioinformatics Institute (EMBL-EBI)
> Wellcome Trust Genome Campus, Hinxton
> Cambridge - CB10 1SD - UK
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe):
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>

--
Michael Schuster, Ph.D.
Ensembl Genome Browser Project
Vertebrate Genomics Team
EMBL - European Bioinformatics Institute
Wellcome Trust Genome Campus, Hinxton
Cambridge CB10 1SD
United Kingdom

http://www.ensembl.org/
