[ensembl-dev] question regarding refseq exons retrieval

Duarte Molha duartemolha at gmail.com
Tue Mar 10 15:30:05 GMT 2015


Thanks Kieron

But this still leaves me with a question.

Say that I have a gene, and I retrieve the correct gene object from the
Ensembl database. How can I output only the transcripts that are referenced
in RefSeq, if not the way I have done it?

If I go the normal way, the $gene->get_all_Transcripts() method will
retrieve all Ensembl transcripts. How can I limit it to only the
RefSeq transcripts?

Thanks

Duarte

=========================
     Duarte Miguel Paulo Molha
         http://about.me/duarte
=========================

On 10 March 2015 at 15:22, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:

> Dear Duarte,
>
> The issue you have exposed is subtle. You seem to be printing “exon stable
> IDs” but expecting them to be RefSeq accessions. Our mistake was to use the
> RefSeq IDs as arbitrary identifiers for internal use, but I must stress that
> what Ensembl calls a Stable ID must never be assumed to have any meaning
> outside of an Ensembl database. What you want are display labels. The exon
> labels were generated by picking only the first of any possible RefSeq IDs,
> hence you cannot get everything you want in this way.
>
> The correct way to handle this in your code is to fetch the transcript
> name and print that with each exon, as RefSeq IDs refer to transcripts and
> not exons.
>
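A minimal sketch of that advice, written in plain Python rather than against the Ensembl Perl API: it loops over transcripts first and labels each exon with the transcript name plus the exon's rank, so an exon shared by two transcripts is printed once per transcript. The `Exon` tuple, the `transcripts` dict and `exon_rows` are hypothetical stand-ins for illustration, and the two transcript models are deliberately truncated to coordinates quoted in this thread.

```python
from collections import namedtuple

# Hypothetical stand-in for an exon record; not an Ensembl API object.
Exon = namedtuple("Exon", "chrom start end strand")

# The three exons that the thread reports as identical in both transcripts.
shared = [
    Exon("chr20", 30946147, 30946635, "+"),
    Exon("chr20", 30954187, 30954269, "+"),
    Exon("chr20", 30955530, 30955532, "+"),
]

# Truncated, illustrative transcript models (assumption: only exons whose
# coordinates appear in the thread are included; later exons are omitted).
transcripts = {
    "NM_001164603.1": shared + [Exon("chr20", 30956818, 30956926, "+")],
    "NM_015338.5": list(shared),
}


def exon_rows(transcripts):
    """Return one row per (transcript, exon) pair, labelled by transcript name."""
    rows = []
    for name, exons in transcripts.items():
        # Rank is counted within each transcript, starting at 1.
        for rank, exon in enumerate(exons, start=1):
            label = f"{name}.{rank}"
            rows.append((label, exon.chrom, exon.start, exon.end, exon.strand))
    return rows


rows = exon_rows(transcripts)
for row in rows:
    print("\t".join(str(field) for field in row))
```

With this ordering the shared first exon is listed both as NM_001164603.1.1 and as NM_015338.5.1, which is the listing the original question expected.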
>
> Regards,
>
> Kieron
>
>
> Kieron Taylor PhD.
> Ensembl Core senior software developer
>
> EMBL, European Bioinformatics Institute
>
>
>
>
>
> > On 10 Mar 2015, at 11:57, Duarte Molha <duartemolha at gmail.com> wrote:
> >
> > Dear developers
> >
> > I have a script that I wrote (attached) that gets me the RefSeq exons for a
> > given input gene.
> >
> > However, when I run this code using the gene ASXL1 as an example, the output is:
> >
> > test_query.pl ASXL1
> >
> > QueryName  feature_type  common_name  Biotype         id                chr    start     end       strand
> > ASXL1      Exon          ASXL1        protein_coding  NM_001164603.1.1  chr20  30946147  30946635  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_001164603.1.2  chr20  30954187  30954269  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_001164603.1.3  chr20  30955530  30955532  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_001164603.1.4  chr20  30956818  30956926  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.5     chr20  31015931  31016051  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.6     chr20  31016128  31016225  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.7     chr20  31017141  31017234  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.8     chr20  31017704  31017856  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.9     chr20  31019124  31019287  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.10    chr20  31019386  31019482  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.11    chr20  31020683  31020788  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.12    chr20  31021087  31021720  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.13    chr20  31022235  31027122  +
> >
> >
> > As you can see, I am missing some of the exons for transcript NM_015338.5.
> > In this case, the first 3 exons of transcript NM_015338.5 are identical to
> > those of NM_001164603.1, but I would expect to have them listed as:
> >
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.1     chr20  30946147  30946635  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.2     chr20  30954187  30954269  +
> > ASXL1      Exon          ASXL1        protein_coding  NM_015338.5.3     chr20  30955530  30955532  +
> >
> > Can you tell me what is wrong with my approach and how I can retrieve
> > the missing data?
> >
> > Best regards
> >
> > Duarte
> > <test_query.pl>
> > _______________________________________________
> > Dev mailing list    Dev at ensembl.org
> > Posting guidelines and subscribe/unsubscribe info:
> > http://lists.ensembl.org/mailman/listinfo/dev
> > Ensembl Blog: http://www.ensembl.info/