[ensembl-dev] retrieving sequence data from previous assemblies using fetch_by_region method

Duarte Molha duartemolha at gmail.com
Tue Oct 7 11:13:03 BST 2014


Any chance you might consider implementing this?
I think it would be very useful to be able to retrieve the underlying
sequence from previous assemblies, since the transform method can already
give us the coordinates of any given feature on an older assembly.
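
To illustrate the coordinate mapping I mean, here is a minimal sketch. It
assumes the default public server (ensembldb.ensembl.org, anonymous user),
an installed API version that matches the databases there, and that the
GRCh38 core database carries a chromosome-level mapping to GRCh37. Note
that it uses the slice's project method rather than a feature's transform
method; the region is the one from my original message.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Bio::EnsEMBL::Registry;

# Assumption: default public server, holding the current (GRCh38) human database.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );

# Region from the original message, on the current assembly.
my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '5', 112043202,
    112046226, 1, 'GRCh38' );
die "Could not fetch the GRCh38 slice\n" unless defined $slice;

# Project onto the older assembly. A region may map to several segments
# (or to none), so each returned segment is reported separately.
foreach my $segment ( @{ $slice->project( 'chromosome', 'GRCh37' ) } ) {
    my $old = $segment->to_Slice();

    # from_start/from_end are 1-based offsets relative to the source slice.
    printf "GRCh38 %d-%d => GRCh37 %s:%d-%d (strand %d)\n",
        $slice->start + $segment->from_start - 1,
        $slice->start + $segment->from_end - 1,
        $old->seq_region_name, $old->start, $old->end, $old->strand;
}
```

This only yields coordinates. Calling seq() on the projected slices in the
GRCh38 database still returns N's, which is why I am asking for the
feature.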


=========================
     Duarte Miguel Paulo Molha
         http://about.me/duarte
=========================

On Tue, Oct 7, 2014 at 8:53 AM, Andy Yates <ayates at ebi.ac.uk> wrote:

> Hi Duarte,
>
> You can access GRCh37 from ensembldb.ensembl.org port number 3337.
> Ensembl databases currently only hold the contigs and assembly for a single
> assembly. That's why, when you try to get sequence for GRCh37 from the
> GRCh38 database, you get N's back.
>
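
For the archive, a minimal sketch of what that suggestion could look like.
Only the host and port come from Andy's message; the anonymous user, the
use of load_registry_from_db, and the assumption that the installed API
version is compatible with the databases on that port are mine. The
coordinates are the ones from my original message, treated here as GRCh37
coordinates.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Bio::EnsEMBL::Registry;

# Connect to the GRCh37 server (port 3337) instead of the default port.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -port => 3337,
    -user => 'anonymous',
);

my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );

# Example region; these values are interpreted as GRCh37 coordinates.
my ( $chrom, $from, $to, $strand ) = ( '5', 112043202, 112046226, 1 );

my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $chrom, $from,
    $to, $strand, 'GRCh37' );
die "No slice returned for chr$chrom:$from-$to\n" unless defined $slice;

# Print the sequence with a simple FASTA-style header.
printf ">chr%s:%d-%d\n%s\n", $chrom, $from, $to, $slice->seq();
```

Because this database holds the GRCh37 contigs, seq() should return real
bases here rather than N's. Coordinates taken from GRCh38 would need to be
mapped to GRCh37 first.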
> Hope this helps,
>
> Andy
>
> ------------
> Andrew Yates - Ensembl Support Coordinator
> European Molecular Biology Laboratory
> European Bioinformatics Institute
> Wellcome Trust Genome Campus
> Hinxton, Cambridge
> CB10 1SD, United Kingdom
> Tel: +44-(0)1223-492538
> Fax: +44-(0)1223-494468
> Skype: andrewyatz
> http://www.ensembl.org/
>
> On 6 Oct 2014, at 17:23, Duarte Molha <duartemolha at gmail.com> wrote:
>
> > Dear developers
> >
> > I have downloaded the latest API and would like to write a script
> > that can retrieve sequence information from a specified assembly.
> >
> > I have made a script that tries to accomplish this.
> >
> > Assuming these coordinates:
> >
> > chr: 5
> > from: 112043202
> > to: 112046226
> > strand = 1
> > assembly = GRCh38
> >
> >     my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $chrom,
> >         $from, $to, $strand, $assembly );
> >
> >     $seq = $slice->seq();
> >
> > I can retrieve the DNA sequence:
> >
> > >CHR5-112043202-112046226       chr5:112043202-112046226
> > AGTATATAATCACAT..............CTAAAAGCAAACA
> >
> >
> > However, if I give it the variables:
> >
> > chr: 5
> > from: 112043202
> > to: 112046226
> > strand = 1
> > assembly = GRCh37 or NCBI36
> >
> > I get:
> >
> > >CHR5-112043202-112046226       chr5:112043202-112046226
> > NNNNNNNNN.........NNNNNNNNNNNNNNNN
> >
> >
> > How can I get the correct underlying sequence?
> >
> > Best regards
> >
> > Duarte
> > _______________________________________________
> > Dev mailing list    Dev at ensembl.org
> > Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> > Ensembl Blog: http://www.ensembl.info/
>