[ensembl-dev] Question regarding UTR retrieval from database

Kieron Taylor ktaylor at ebi.ac.uk
Fri Apr 17 09:51:07 BST 2015


Hi Duarte,

Given the namespace of the method you’ve found, I would strongly recommend caution. Bioinformatics formats are notoriously diverse and loosely specified, hence code to handle them is often somewhat bespoke. At any rate, that method is not helpful for your case.

Thank you for bringing your needs to our attention. We can perhaps add support in future releases, should time allow. Convenience methods are often missing from our API, due to a lack of apparent need and a general shortage of developer time.

Kieron


Kieron Taylor PhD.
Ensembl Core senior software developer

EMBL, European Bioinformatics Institute





> On 16 Apr 2015, at 21:38, Duarte Molha <duartemolha at gmail.com> wrote:
> 
> Actually, there is an undocumented method called Bio::EnsEMBL::Utils::IO::GTFSerializer::get_all_UTR_features()
> 
> What is that all about ?
> 
> =========================
>      Duarte Miguel Paulo Molha      
>          http://about.me/duarte         
> =========================
> 
> On 16 April 2015 at 21:35, Duarte Molha <duartemolha at gmail.com> wrote:
> Yes... I knew of that method, I had just forgotten it. I still think the reverse of it would be useful on its own: get_all_untranslateable_Exons.
> 
> 
> 
> On 16 April 2015 at 21:29, <mr6 at ebi.ac.uk> wrote:
> Hi Duarte,
> 
> You might find the get_all_translateable_Exons method useful.
> http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1Transcript.html#a17e718ddd3d054de7b358029e6d48d20
> 
> This would correspond to the get_coding_regions you are looking for, as
> the exons returned are truncated to their coding region.
> For the get_noncoding_regions however, you would need to look at all the
> features from get_all_Exons that are not in get_all_translateable_Exons.
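
As an illustration of the approach described above (everything in get_all_Exons that is not covered by get_all_translateable_Exons), here is a minimal sketch of the interval subtraction. It is plain Python rather than Ensembl Perl API code, it assumes inclusive coordinates on a single strand, and the coordinates are made-up toy values. The name subtract_intervals is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), both inclusive


def subtract_intervals(exons: List[Interval], coding: List[Interval]) -> List[Interval]:
    """Return the parts of each exon that no coding interval covers."""
    remaining: List[Interval] = []
    coding_sorted = sorted(coding)
    for ex_start, ex_end in sorted(exons):
        cursor = ex_start  # first position of this exon not yet accounted for
        for cd_start, cd_end in coding_sorted:
            if cd_end < cursor or cd_start > ex_end:
                continue  # no overlap with what is left of this exon
            if cd_start > cursor:
                # Non-coding stretch before this coding interval begins
                remaining.append((cursor, cd_start - 1))
            cursor = max(cursor, cd_end + 1)
        if cursor <= ex_end:
            # Non-coding stretch after the last overlapping coding interval
            remaining.append((cursor, ex_end))
    return remaining


# Toy data (not real annotation): three exons, coding region 150..550
exons = [(100, 200), (300, 400), (500, 600)]
coding = [(150, 200), (300, 400), (500, 550)]  # exons truncated to coding part
noncoding = subtract_intervals(exons, coding)
print(noncoding)  # [(100, 149), (551, 600)]
```

The fully coding middle exon contributes nothing, while the first and last exons each contribute their untranslated portion.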
> 
> 
> Hope that helps,
> Magali
> 
> > Thanks Kieron
> >
> > I understand your point of view... but I still think there is a case for a
> > couple of methods to be implemented in the transcript object:
> > @{$transcript->get_coding_regions} and
> > @{$transcript->get_noncoding_regions}
> >
> > Both returning feature objects. Am I the only one to find these useful? I
> > hope not :)
> >
> > Thanks
> >
> > Duarte
> >
> >
> >
> >
> > On 16 April 2015 at 16:19, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:
> >
> >> Hi Duarte,
> >>
> >> The coordinates you’re getting back are pre-splicing. The method you’re
> >> calling is from the Transcript class, hence the response is with reference
> >> to that object. If you’re after exon coordinates, you should be attempting
> >> to work with exon objects, such as fetching the exons of the transcript and
> >> asking them for coding_region_start($transcript) until numbers start
> >> appearing. Your workaround is also a valid approach.
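
The difference between the transcript-level answer and the per-exon answer can be sketched as follows. This is plain Python, not the Ensembl API: exons are (start, end) tuples and the transcript is assumed to be on the forward strand. The coding start of 7777172 is inferred from the UTR end reported in the original question further down this thread. The end of the third exon (7777300) is a placeholder, because the thread does not give it. five_prime_utr_pieces is a hypothetical helper name.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), both inclusive


def five_prime_utr_pieces(exons: List[Interval], cds_start: int) -> List[Interval]:
    """Forward strand only: the portion of each exon lying before cds_start."""
    pieces: List[Interval] = []
    for start, end in sorted(exons):
        if start >= cds_start:
            break  # this exon and all later ones begin inside or after the CDS
        # Clip the exon so it stops one base before the coding region starts
        pieces.append((start, min(end, cds_start - 1)))
    return pieces


# First two exons and the start of the third come from the question in this
# thread; the third exon's end (7777300) is an ASSUMED placeholder value.
exons = [(7772707, 7773198), (7773442, 7773511), (7777160, 7777300)]
cds_start = 7777172  # inferred: reported UTR end 7777171 plus one

pieces = five_prime_utr_pieces(exons, cds_start)
span = (pieces[0][0], pieces[-1][1])  # what a single pre-splicing feature covers

for start, end in pieces:
    print(f"chr1\t{start}\t{end}")
print("pre-splicing span:", span)
```

The span from the first piece's start to the last piece's end reproduces the single pre-splicing interval (7772707 to 7777171), while the pieces are the three per-exon rows asked for in the original question.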
> >>
> >> My explanation isn’t very satisfactory, but we try to avoid writing
> >> methods that need complex return types, such as the list of lists required
> >> for your use case. More often than not, users require other attributes of
> >> the objects too, so you would still end up with a list of exons. I hope
> >> that helps.
> >>
> >> Regards,
> >>
> >> Kieron
> >>
> >>
> >> > On 16 Apr 2015, at 09:15, Duarte Molha <duartemolha at gmail.com> wrote:
> >> >
> >> > Anyone able to provide me some help on this?
> >> >
> >> > I have now found a way around this issue by finding the exonic regions within the reported UTR, but would very much like to understand the thinking behind this.
> >> >
> >> > Best regards
> >> >
> >> > Duarte
> >> >
> >> >
> >> >
> >> > On 14 April 2015 at 13:40, Duarte Molha <duartemolha at gmail.com> wrote:
> >> > Dear Developers
> >> >
> >> > Please consider the transcript :
> >> >
> >> > ENST00000470357
> >> >
> >> > I am trying to retrieve the coordinates of the UTR regions of this transcript.
> >> > To this end I have a script that basically starts with the transcript feature object $transcript:
> >> >
> >> > my $five_prime  = $transcript->five_prime_utr_Feature;
> >> >
> >> > $feature_params->{start} = $five_prime->start;
> >> > $feature_params->{end}  = $five_prime->end;
> >> >
> >> > However, in this case the script will output the coordinates from the start of the first non-coding exon to the end of the non-coding portion of the third exon (chr1 7772707 7777171).
> >> > How can I change this so that the script will only output the coordinates of the non-coding exon portions?
> >> >
> >> > In this case I would like to output:
> >> >
> >> > chr1  7772707 7773198
> >> > chr1  7773442 7773511
> >> > chr1  7777160 7777171
> >> >
> >> > Is there a simple way of achieving this?
> >> >
> >> > Many thanks
> >> >
> >> > Duarte
> >> >
> >> >
> >> > _______________________________________________
> >> > Dev mailing list    Dev at ensembl.org
> >> > Posting guidelines and subscribe/unsubscribe info:
> >> http://lists.ensembl.org/mailman/listinfo/dev
> >> > Ensembl Blog: http://www.ensembl.info/




