[ensembl-dev] Question regarding UTR retrieval from database

Duarte Molha duartemolha at gmail.com
Thu Apr 16 21:38:52 BST 2015


Actually, you do have an undocumented public method called
Bio::EnsEMBL::Utils::IO::GTFSerializer::get_all_UTR_features()

What is that all about?

=========================
     Duarte Miguel Paulo Molha
         http://about.me/duarte
=========================

On 16 April 2015 at 21:35, Duarte Molha <duartemolha at gmail.com> wrote:

> Yes... I knew of that method... I had just forgotten it. I still think the
> reverse of it would be useful on its own: get_all_untranslateable_Exons.
>
>
> On 16 April 2015 at 21:29, <mr6 at ebi.ac.uk> wrote:
>
>> Hi Duarte,
>>
>> You might find the get_all_translateable_Exons method useful.
>>
>> http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1Transcript.html#a17e718ddd3d054de7b358029e6d48d20
>>
>> This would correspond to the get_coding_regions you are looking for, as
>> the exons returned are truncated to their coding region.
>> For the get_noncoding_regions however, you would need to look at all the
>> features from get_all_Exons that are not in get_all_translateable_Exons.
>>
>>
>> Hope that helps,
>> Magali
>>
>> > Thanks Kieron
>> >
>> > I understand your point of view... but I still think there is a case for
>> > a couple of methods to be implemented in the transcript object:
>> > @{$transcript->get_coding_regions} and
>> > @{$transcript->get_noncoding_regions}
>> >
>> > Both would return feature objects. Am I the only one to find these
>> > useful? I hope not :)
>> >
>> > Thanks
>> >
>> > Duarte
>> >
>> >
>> > On 16 April 2015 at 16:19, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:
>> >
>> >> Hi Duarte,
>> >>
>> >> The coordinates you’re getting back are pre-splicing. The method you’re
>> >> calling is from the Transcript class, hence the response is with
>> >> reference to that object. If you’re after exon coordinates, you should
>> >> be attempting to work with exon objects, such as fetching the exons of
>> >> the transcript and asking them for coding_region_start($transcript)
>> >> until numbers start appearing. Your workaround is also a valid approach.
>> >>
>> >> My explanation isn’t very satisfactory, but we try to avoid writing
>> >> methods that need complex return types, such as the list of lists
>> >> required for your use case. More often than not, users require other
>> >> attributes of the objects too, so you would still end up with a list of
>> >> exons. I hope that helps.
>> >>
>> >> Regards,
>> >>
>> >> Kieron
>> >>
>> >>
>> >> Kieron Taylor PhD.
>> >> Ensembl Core senior software developer
>> >>
>> >> EMBL, European Bioinformatics Institute
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> > On 16 Apr 2015, at 09:15, Duarte Molha <duartemolha at gmail.com> wrote:
>> >> >
>> >> > Anyone able to provide me some help on this?
>> >> >
>> >> > I have now found a way around this issue by finding the exonic regions
>> >> > within the reported UTR, but would very much like to understand the
>> >> > thinking behind this.
>> >> >
>> >> > Best regards
>> >> >
>> >> > Duarte
>> >> >
>> >> >
>> >> > On 14 April 2015 at 13:40, Duarte Molha <duartemolha at gmail.com> wrote:
>> >> > Dear Developers
>> >> >
>> >> > Please consider the transcript :
>> >> >
>> >> > ENST00000470357
>> >> >
>> >> > I am trying to retrieve the coordinates of the UTR regions of this
>> >> > transcript. To this end I have a script that basically starts with the
>> >> > transcript feature object $transcript:
>> >> >
>> >> > my $five_prime  = $transcript->five_prime_utr_Feature;
>> >> >
>> >> > $feature_params->{start} = $five_prime->start;
>> >> > $feature_params->{end}  = $five_prime->end;
>> >> >
>> >> > However, in this case the script will output the coordinates from the
>> >> > start of the 1st non-coding exon to the end of the non-coding portion
>> >> > of the 3rd exon (chr1     7772707 7777171).
>> >> > How can I change this so that the script will only output the
>> >> > coordinates of the non-coding exon portions?
>> >> >
>> >> > In this case I would like to output:
>> >> >
>> >> > chr1  7772707 7773198
>> >> > chr1  7773442 7773511
>> >> > chr1  7777160 7777171
>> >> >
>> >> > Is there a simple way of achieving this?
>> >> >
>> >> > Many thanks
>> >> >
>> >> > Duarte
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > Dev mailing list    Dev at ensembl.org
>> >> > Posting guidelines and subscribe/unsubscribe info:
>> >> > http://lists.ensembl.org/mailman/listinfo/dev
>> >> > Ensembl Blog: http://www.ensembl.info/
>
>
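
The approach discussed in the thread above (keep only the parts of each exon
that fall outside the coding region) can be sketched on plain coordinate
pairs. This is a minimal illustration under stated assumptions, not an Ensembl
API method: the name noncoding_segments is hypothetical; the first two exons
and the start of the third are read off the expected output in the original
question; the end of the third exon (7777300), the coding start (7777172, one
base past the reported 5' UTR end), the coding end, and the forward-strand
orientation are all assumed for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return the parts of each exon that lie outside the
# coding region [cds_start, cds_end]. Exons are [start, end] pairs in
# 1-based, inclusive genomic coordinates.
sub noncoding_segments {
    my ($exons, $cds_start, $cds_end) = @_;
    my @segments;
    for my $exon (sort { $a->[0] <=> $b->[0] } @$exons) {
        my ($start, $end) = @$exon;
        # Part of the exon that lies before the coding region
        if ($start < $cds_start) {
            my $seg_end = $end < $cds_start ? $end : $cds_start - 1;
            push @segments, [$start, $seg_end];
        }
        # Part of the exon that lies after the coding region
        if ($end > $cds_end) {
            my $seg_start = $start > $cds_end ? $start : $cds_end + 1;
            push @segments, [$seg_start, $end];
        }
    }
    return \@segments;
}

# Exon 3's end and both coding bounds are assumptions for illustration.
my @exons = (
    [7772707, 7773198],
    [7773442, 7773511],
    [7777160, 7777300],
);
my ($cds_start, $cds_end) = (7777172, 7777300);

for my $seg (@{ noncoding_segments(\@exons, $cds_start, $cds_end) }) {
    printf "chr1\t%d\t%d\n", @$seg;
}
```

Under those assumptions this prints the three intervals asked for in the
original question. With the Ensembl API, the exon pairs would presumably come
from get_all_Exons and the coding bounds from the transcript's
coding_region_start and coding_region_end; that wiring is left out so the
sketch runs without a database connection.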