[ensembl-dev] Question regarding UTR retrieval from database

Kieron Taylor ktaylor at ebi.ac.uk
Thu Apr 16 16:19:30 BST 2015


Hi Duarte,

The coordinates you’re getting back are pre-splicing: they span the whole 5' UTR on the genome, introns included. The method you’re calling belongs to the Transcript class, so the response is with reference to that object. If you’re after exon-level coordinates, work with the exon objects instead: fetch the exons of the transcript and ask each one for coding_region_start($transcript) until numbers start appearing. Your workaround is also a valid approach.
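As a rough sketch of the exon-by-exon idea, the clipping can be written against plain coordinates. This is not an API method: five_prime_utr_segments is a hypothetical helper name, and I am assuming the inputs would be filled from $transcript->get_all_Exons, $exon->start, $exon->end, $transcript->coding_region_start, $transcript->coding_region_end and $transcript->strand. The end of the third exon and the end of the coding region below are made-up placeholders; the other numbers are taken from your example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API).
# Returns the 5' UTR as an array ref of [start, end] pairs, one per exon
# that contains untranslated sequence, in ascending genomic order.
#   $exons     - array ref of [start, end] genomic exon coordinates
#   $cds_start - lowest genomic coordinate of the coding region
#   $cds_end   - highest genomic coordinate of the coding region
#   $strand    - 1 or -1
# Assumes the transcript is coding, i.e. both CDS bounds are defined.
sub five_prime_utr_segments {
    my ($exons, $cds_start, $cds_end, $strand) = @_;
    my @segments;
    for my $exon (sort { $a->[0] <=> $b->[0] } @$exons) {
        my ($start, $end) = @$exon;
        if ($strand >= 0) {
            # Forward strand: the 5' UTR lies below the coding start.
            next if $start >= $cds_start;
            $end = $cds_start - 1 if $end >= $cds_start;
        }
        else {
            # Reverse strand: the 5' UTR lies above the coding end.
            next if $end <= $cds_end;
            $start = $cds_end + 1 if $start <= $cds_end;
        }
        push @segments, [ $start, $end ];
    }
    return \@segments;
}

# Exons 1 and 2 are entirely non-coding; exon 3 is partly coding.
# 7777300 is a placeholder, not the real exon or CDS end.
my $exons = [
    [ 7772707, 7773198 ],
    [ 7773442, 7773511 ],
    [ 7777160, 7777300 ],
];
my $utr = five_prime_utr_segments($exons, 7777172, 7777300, 1);
print join("\t", 'chr1', @$_), "\n" for @$utr;
```

With these inputs the loop prints the three tab-separated rows you listed. The same clipping, mirrored around the coding end, would give the 3' UTR pieces.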

My explanation isn’t very satisfactory, but we try to avoid writing methods that need complex return types, such as the list of lists your use case requires. More often than not, users need other attributes of the objects too, so you would still end up with a list of exons. I hope that helps.

Regards,

Kieron


Kieron Taylor PhD.
Ensembl Core senior software developer

EMBL, European Bioinformatics Institute





> On 16 Apr 2015, at 09:15, Duarte Molha <duartemolha at gmail.com> wrote:
> 
> Anyone able to provide me some help on this? 
> 
> I have now found a way around this issue by finding the exonic regions within the reported UTR, but would very much like to understand the thinking behind this.
> 
> Best regards
> 
> Duarte
> 
> 
> =========================
>      Duarte Miguel Paulo Molha      
>          http://about.me/duarte         
> =========================
> 
> On 14 April 2015 at 13:40, Duarte Molha <duartemolha at gmail.com> wrote:
> Dear Developers
> 
> Please consider the transcript :
> 
> ENST00000470357
> 
> I am trying to retrieve the coordinates of the UTR regions of this transcript.
> To this end I have a script that basically starts with the transcript feature object $transcript
> 
> my $five_prime  = $transcript->five_prime_utr_Feature;
> 
> $feature_params->{start} = $five_prime->start;
> $feature_params->{end}  = $five_prime->end;
> 
> However, in this case the script will output the coordinates from the start of the 1st non-coding exon to the end of the non-coding portion of the 3rd exon (chr1	7772707	7777171).
> How can I change this so that the script will only output the coordinates of the non-coding exon portions?
> 
> In this case I would like to output:
> 
> chr1	7772707	7773198
> chr1	7773442	7773511
> chr1	7777160	7777171
> 
> Is there a simple way of achieving this?
> 
> Many thanks
> 
> Duarte
> 
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
