[ensembl-dev] how to calculate transcript length ?

enrico1970 at yahoo.com enrico1970 at yahoo.com
Mon Aug 27 23:56:40 BST 2012


Dear Thibaut and Jay,
I really appreciate your suggestions to the list, but I have a query similar to Gang's.
On the page http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000089234;r=12:112080797-112123790;t=ENST00000419234

the transcript ENST00000419234 has a length of 3179 nucleotides, which corresponds to the sum of its exon lengths, so I would expect the protein to have about 3179/3 = 1059 amino acids,

but it has only 592 amino acids. The same thing happens for other transcripts.

How are the length of the transcript and the length of the protein defined?
Kind regards,

Enrico Rubagotti
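
One possible reading of the numbers above, offered as a sketch and not as Ensembl's official definition: the transcript length counts every exonic base, including untranslated regions (UTRs), while the protein length counts only the codons of the coding sequence (CDS). The Python below only does the arithmetic on the two figures quoted in this message. It assumes a complete CDS ending in a stop codon, which has not been checked for ENST00000419234, and the function name is made up for illustration.

```python
def noncoding_bases(transcript_len, protein_len, has_stop_codon=True):
    """Exonic bases left over once the coding sequence is accounted for.

    Simple model (an assumption): CDS = 3 bases per amino acid,
    plus 3 bases for the stop codon when the CDS is complete.
    """
    cds_len = protein_len * 3 + (3 if has_stop_codon else 0)
    if cds_len > transcript_len:
        raise ValueError("CDS cannot be longer than the transcript")
    return transcript_len - cds_len


transcript_len = 3179   # sum of exon lengths, as quoted in the message
protein_len = 592       # amino acids, as quoted in the message

# Dividing the whole transcript by 3 ignores the UTRs.
naive_protein_len = transcript_len // 3

# Bases not explained by the protein under the model above.
utr_len = noncoding_bases(transcript_len, protein_len)

print("naive protein length:", naive_protein_len)
print("bases outside the CDS:", utr_len)
```

Under these assumptions, the gap between 1059 and 592 amino acids corresponds to exonic sequence that is transcribed but not translated.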





Hi,
I would recommend using the Perl API for all the information you
want to retrieve from Ensembl.
It's quite easy to use, and if there is a schema change you will not
have to change all your SQL queries.

Here is some documentation:
http://www.ensembl.org/info/docs/api/core/core_tutorial.html
http://www.ensembl.org/info/docs/Doxygen/core-api/index.html
http://www.ensembl.org/info/docs/api/index.html

Here is some code for your query:

use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

# Connect to the public Ensembl database server.
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',    # alternatively 'useastdb.ensembl.org'
    -user => 'anonymous'
);

my $gene_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Gene' );
my $gene = $gene_adaptor->fetch_by_stable_id('ENSG00000089234');

# Print the length (sum of exon lengths) of each transcript of the gene.
foreach my $transcript ( @{ $gene->get_all_Transcripts() } ) {
    print STDOUT 'Length of ', $transcript->display_id, ': ', $transcript->length, "\n";
}

Regards
Thibaut

On 31/07/12 12:49, Jay Humphrey wrote:
> Length is end - start + 1.
> 1 2 3 [4 5 6 7 8] 9
> start = 4, end = 8
> 8 - 4 = 4, actually there are 5 residues.
>
> On 31/07/2012 10:39, ?? wrote:
>> Hi All
>> I wondering how to calculate transcript length within Ensembl database.
>> I try to sum exon's length:
>>
>> SELECT tp.stable_id, SUM( e.seq_region_end ) - SUM( e.seq_region_start )
>> FROM gene g
>> JOIN transcript tp ON ( g.gene_id = tp.gene_id )
>> JOIN exon_transcript et ON ( et.transcript_id = tp.transcript_id )
>> JOIN exon e ON ( e.exon_id = et.exon_id )
>> WHERE g.stable_id = 'ENSG00000089234'
>> GROUP BY tp.stable_id
>>
>> But the result is inconsistent with Ensembl official data:
>> http://asia.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000089234;r=12:112080797-112123790
>>
>> If you know how to dig out the datas of
>> variation,orthologue,paralogue,regulation. please also tell me.
>>
>> Thanks million
>> --
>> Gang Chen
>> TILSI
>> Taicang Institute For Life Science Information
>> Address: A2/162, Renmin South Road, Taicang, 215400, Jiangsu
>> Province, P.R.China
>> Phone: (+86)512-82782588
>>
>> _______________________________________________
>> Dev mailing list
>> Dev at ensembl.org
>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>
> --
> Jay Humphrey                   Ensembl Genomes Web Developer
> EMBL-EBI                       Tel: +44-(0)1223-492682
> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/
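
The off-by-one that Jay describes can be checked on toy data. The sketch below uses invented exon coordinates (not taken from Ensembl) to show that SUM(end) - SUM(start) falls short of the summed exon lengths by exactly one base per exon. That would account for the mismatch in the SQL query above, assuming it is the only difference.

```python
def exon_length(start, end):
    """Length of an exon with inclusive 1-based coordinates."""
    return end - start + 1


# Hypothetical exons as (start, end) pairs; the first one is Jay's example.
exons = [(4, 8), (20, 29), (100, 149)]

# What the original query computes: SUM(end) - SUM(start).
naive = sum(end for _, end in exons) - sum(start for start, _ in exons)

# Sum of per-exon lengths with the +1 applied to each exon.
correct = sum(exon_length(start, end) for start, end in exons)

print("naive:", naive)
print("correct:", correct)
print("shortfall:", correct - naive, "for", len(exons), "exons")
```

In SQL terms the corrected aggregate would be SUM( e.seq_region_end - e.seq_region_start + 1 ).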