[ensembl-dev] Question regarding UTR retrieval from database

Duarte Molha duartemolha at gmail.com
Fri Apr 17 10:58:05 BST 2015


Well... that was probably the whole point of moving your development to
GitHub, no?

I don't know if you are taking pull requests from the general public, but
if you aren't... you should. ;-)

Thanks

Duarte

=========================
     Duarte Miguel Paulo Molha
         http://about.me/duarte
=========================

On 17 April 2015 at 09:51, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:

> Hi Duarte,
>
> Given the namespace of the method you’ve found, I would strongly recommend
> caution. Bioinformatics formats are notoriously diverse and loosely
> specified, hence code to handle them is often somewhat bespoke. At any
> rate, that method is not helpful for your case.
>
> Thank you for bringing your needs to our attention. We can perhaps add
> support in future releases, should time allow. Convenient methods are often
> missing from our API due to a lack of apparent need and a general shortage
> of developer time.
>
> Kieron
>
>
> Kieron Taylor PhD.
> Ensembl Core senior software developer
>
> EMBL, European Bioinformatics Institute
>
>
>
>
>
> > On 16 Apr 2015, at 21:38, Duarte Molha <duartemolha at gmail.com> wrote:
> >
> > Actually, you do have an undocumented public method called
> > Bio::EnsEMBL::Utils::IO::GTFSerializer::get_all_UTR_features()
> >
> > What is that all about?
> >
> > On 16 April 2015 at 21:35, Duarte Molha <duartemolha at gmail.com> wrote:
> > Yes... I knew of that method... had just forgotten it. I still think the
> > reverse of it would be useful on its own: get_all_untranslateable_Exons.
> >
> >
> > On 16 April 2015 at 21:29, <mr6 at ebi.ac.uk> wrote:
> > Hi Duarte,
> >
> > You might find the get_all_translateable_Exons method useful.
> >
> > http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1Transcript.html#a17e718ddd3d054de7b358029e6d48d20
> >
> > This would correspond to the get_coding_regions you are looking for, as
> > the exons returned are truncated to their coding region.
> > For the get_noncoding_regions however, you would need to look at all the
> > features from get_all_Exons that are not in get_all_translateable_Exons.
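
The "exons that are not in get_all_translateable_Exons" step can be sketched as a plain interval subtraction. This is only a rough illustration under stated assumptions: subtract_intervals is a hypothetical helper name, it works on bare [start, end] pairs rather than Ensembl objects, and the end of the third exon and the coding interval are made-up values (only the first two exons and the start of the third come from this thread).

```perl
use strict;
use warnings;

# Hypothetical helper: subtract a list of coding intervals from a list of
# exon intervals. Both lists hold [start, end] pairs (1-based, inclusive)
# and are assumed to lie on the same sequence region.
sub subtract_intervals {
    my ($exons, $coding) = @_;
    my @noncoding;
    for my $exon (sort { $a->[0] <=> $b->[0] } @$exons) {
        my ($start, $end) = @$exon;
        # Coding pieces that overlap this exon, in ascending order.
        my @hits = sort { $a->[0] <=> $b->[0] }
                   grep { $_->[0] <= $end && $_->[1] >= $start } @$coding;
        my $cursor = $start;
        for my $hit (@hits) {
            # Keep the gap between the cursor and the next coding piece.
            push @noncoding, [$cursor, $hit->[0] - 1] if $hit->[0] > $cursor;
            $cursor = $hit->[1] + 1 if $hit->[1] + 1 > $cursor;
        }
        # Keep whatever is left after the last coding piece.
        push @noncoding, [$cursor, $end] if $cursor <= $end;
    }
    return \@noncoding;
}

# The end of the third exon (7777300) and the coding interval are made up.
my @exons  = ([7772707, 7773198], [7773442, 7773511], [7777160, 7777300]);
my @coding = ([7777172, 7777300]);

print join("\t", 'chr1', @$_), "\n"
    for @{ subtract_intervals(\@exons, \@coding) };
```

With real objects, the pairs could presumably be built from ->start and ->end over @{ $transcript->get_all_Exons } and @{ $transcript->get_all_translateable_Exons }. Note that the result would contain both 5' and 3' untranslated pieces.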
> >
> >
> > Hope that helps,
> > Magali
> >
> > > Thanks Kieron
> > >
> > > I understand your point of view... but I still think there is a case for
> > > a couple of methods to be implemented in the transcript object:
> > > @{$transcript->get_coding_regions} and
> > > @{$transcript->get_noncoding_regions}
> > >
> > > Both returning feature objects. Am I the only one to find these useful?
> > > I hope not :)
> > >
> > > Thanks
> > >
> > > Duarte
> > >
> > > On 16 April 2015 at 16:19, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:
> > >
> > >> Hi Duarte,
> > >>
> > >> The coordinates you’re getting back are pre-splicing. The method you’re
> > >> calling is from the Transcript class, hence the response is with
> > >> reference to that object. If you’re after exon coordinates, you should
> > >> be attempting to work with exon objects, such as fetching the exons of
> > >> the transcript and asking them for coding_region_start($transcript)
> > >> until numbers start appearing. Your workaround is also a valid approach.
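
The exon-by-exon approach described above might look roughly like the following. It is a sketch, not part of the Ensembl API: noncoding_exon_parts is a hypothetical helper name, and the code assumes that $exon->coding_region_start($transcript) and $exon->coding_region_end($transcript) return undef for an exon with no coding sequence and otherwise return forward-strand coordinates in the same coordinate system as $exon->start and $exon->end.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the non-coding part of every exon of a
# transcript as [start, end] pairs. Assumes $transcript behaves like a
# Bio::EnsEMBL::Transcript and its exons like Bio::EnsEMBL::Exon objects.
sub noncoding_exon_parts {
    my ($transcript) = @_;
    my @parts;
    for my $exon (@{ $transcript->get_all_Exons }) {
        my $cds_start = $exon->coding_region_start($transcript);
        my $cds_end   = $exon->coding_region_end($transcript);

        # Assumed behaviour: undef means the exon has no coding sequence,
        # so the whole exon is untranslated.
        if (!defined $cds_start || !defined $cds_end) {
            push @parts, [$exon->start, $exon->end];
            next;
        }
        # Otherwise keep whatever lies on either side of the coding part.
        push @parts, [$exon->start, $cds_start - 1] if $cds_start > $exon->start;
        push @parts, [$cds_end + 1, $exon->end]     if $cds_end   < $exon->end;
    }
    # Return the pieces in ascending coordinate order.
    return [ sort { $a->[0] <=> $b->[0] } @parts ];
}
```

Under those assumptions, a transcript without a translation would have every exon reported in full, and the caller would still need to separate 5' from 3' pieces if that distinction matters.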
> > >>
> > >> My explanation isn’t very satisfactory, but we try to avoid writing
> > >> methods that need complex return types, such as the list of lists
> > >> required for your use case. More often than not, users require other
> > >> attributes of the objects too, so you would still end up with a list of
> > >> exons. I hope that helps.
> > >>
> > >> Regards,
> > >>
> > >> Kieron
> > >>
> > >> > On 16 Apr 2015, at 09:15, Duarte Molha <duartemolha at gmail.com> wrote:
> > >> >
> > >> > Anyone able to provide me some help on this?
> > >> >
> > >> > I have now found a way around this issue by finding the exonic regions
> > >> > within the reported UTR, but would very much like to understand the
> > >> > thinking behind this.
> > >> >
> > >> > Best regards
> > >> >
> > >> > Duarte
> > >> >
> > >> >
> > >> > On 14 April 2015 at 13:40, Duarte Molha <duartemolha at gmail.com> wrote:
> > >> > Dear Developers
> > >> >
> > >> > Please consider the transcript:
> > >> >
> > >> > ENST00000470357
> > >> >
> > >> > I am trying to retrieve the coordinates of the UTR regions of this
> > >> > transcript. To this end I have a script that basically starts with the
> > >> > transcript feature object $transcript:
> > >> >
> > >> > my $five_prime  = $transcript->five_prime_utr_Feature;
> > >> >
> > >> > $feature_params->{start} = $five_prime->start;
> > >> > $feature_params->{end}   = $five_prime->end;
> > >> >
> > >> > However, in this case the script will output the coordinates from the
> > >> > start of the 1st non-coding exon to the end of the non-coding portion
> > >> > of the 3rd exon (chr1     7772707 7777171).
> > >> > How can I change this so that the script will only output the
> > >> > coordinates of the non-coding exon portions?
> > >> >
> > >> > In this case I would like to output:
> > >> >
> > >> > chr1  7772707 7773198
> > >> > chr1  7773442 7773511
> > >> > chr1  7777160 7777171
> > >> >
> > >> > Is there a simple way of achieving this?
> > >> >
> > >> > Many thanks
> > >> >
> > >> > Duarte
> > >> >
> > >> >
> > >> > _______________________________________________
> > >> > Dev mailing list    Dev at ensembl.org
> > >> > Posting guidelines and subscribe/unsubscribe info:
> > >> http://lists.ensembl.org/mailman/listinfo/dev
> > >> > Ensembl Blog: http://www.ensembl.info/
>