[ensembl-dev] how to calculate transcript length ?
enrico1970 at yahoo.com
Mon Aug 27 23:56:40 BST 2012
Dear Thibaut and Jay,
I really appreciate your suggestions to the list, but I have a query similar to Gang's.
At the page http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000089234;r=12:112080797-112123790;t=ENST00000419234
The transcript ENST00000419234 has a length of 3179 nucleotides, which corresponds to the sum of its exons. I would therefore expect the protein to have 3179/3 = 1059 amino acids,
but it has only 592 amino acids. The same phenomenon happens for other transcripts.
How are the length of the transcript and the length of the protein defined?
Kind regards,
Enrico Rubagotti
Hi,
I would recommend using the Perl API for all the information you
want to retrieve from Ensembl.
It's quite easy to use, and if there is a schema change you will not
have to change all your SQL queries.
Here is some documentation:
http://www.ensembl.org/info/docs/api/core/core_tutorial.html
http://www.ensembl.org/info/docs/Doxygen/core-api/index.html
http://www.ensembl.org/info/docs/api/index.html

Here is some code for your query:

use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',    # alternatively 'useastdb.ensembl.org'
    -user => 'anonymous'
);

my $gene_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Gene' );
my $gene = $gene_adaptor->fetch_by_stable_id('ENSG00000089234');

# Print the length of every transcript of the gene
foreach my $transcript ( @{ $gene->get_all_Transcripts() } ) {
    print STDOUT 'Length of ', $transcript->display_id, ': ', $transcript->length, "\n";
}

Regards
Thibaut

On 31/07/12 12:49, Jay Humphrey wrote:
> Length is end - start + 1.
>
> 1 2 3 [4 5 6 7 8] 9
>
> start = 4, end = 8
> 8 - 4 = 4, but there are actually 5 residues.
>
> On 31/07/2012 10:39, ?? wrote:
>> Hi All
>>
>> I am wondering how to calculate transcript length within the Ensembl database.
>> I tried to sum the exon lengths:
>>
>> SELECT tp.stable_id, SUM( e.seq_region_end ) - SUM( e.seq_region_start )
>> FROM gene g
>> JOIN transcript tp ON ( g.gene_id = tp.gene_id )
>> JOIN exon_transcript et ON ( et.transcript_id = tp.transcript_id )
>> JOIN exon e ON ( e.exon_id = et.exon_id )
>> WHERE g.stable_id = 'ENSG00000089234'
>> GROUP BY tp.stable_id
>>
>> But the result is inconsistent with the official Ensembl data:
>> http://asia.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000089234;r=12:112080797-112123790
>>
>> If you know how to dig out the data for variation, orthologues,
>> paralogues and regulation, please also tell me.
>>
>> Thanks a million
>> --
>> Gang Chen
>> TILSI
>> Taicang Institute For Life Science Information
>> Address: A2/162, Renmin South Road, Taicang, 215400, Jiangsu Province, P.R.China
>> Phone: (+86)512-82782588
>>
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org
>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>
> --
> Jay Humphrey                      Ensembl Genomes Web Developer
> EMBL-EBI                          Tel: +44-(0)1223-492682
> Wellcome Trust Genome Campus      Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK            http://www.ensemblgenomes.org/
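
To make Jay's point about inclusive coordinates concrete, here is a minimal Python sketch. It is not from the thread: the exon coordinates are made up for illustration and the function names are hypothetical. It compares the per-exon rule (end - start + 1) with the SUM(end) - SUM(start) form used in the SQL query above. Under these assumptions the two results differ by exactly one nucleotide per exon, which suggests writing the aggregate as SUM( e.seq_region_end - e.seq_region_start + 1 ) instead.

```python
# Hypothetical exon coordinates (1-based, inclusive); NOT real Ensembl data.
exons = [(4, 8), (20, 29), (40, 41)]


def exon_length(start, end):
    """Length of an inclusive interval: end - start + 1."""
    return end - start + 1


def transcript_length(exon_list):
    """Spliced transcript length as the sum of the individual exon lengths."""
    return sum(exon_length(start, end) for start, end in exon_list)


def sum_end_minus_sum_start(exon_list):
    """Mirror of SUM(seq_region_end) - SUM(seq_region_start) from the query."""
    return sum(end for _, end in exon_list) - sum(start for start, _ in exon_list)


spliced = transcript_length(exons)
naive = sum_end_minus_sum_start(exons)

# The naive form misses one nucleotide for every exon.
print("sum of exon lengths:", spliced)
print("SUM(end) - SUM(start):", naive)
print("difference:", spliced - naive, "for", len(exons), "exons")
```

The first exon (4, 8) is Jay's own example: 8 - 4 = 4, but the interval contains 5 residues.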