[ensembl-dev] retrieving sequence data from previous assemblies using fetch_by_region method
Andy Yates
ayates at ebi.ac.uk
Tue Oct 7 08:53:30 BST 2014
Hi Duarte,
You can access GRCh37 from ensembldb.ensembl.org on port 3337. An Ensembl database currently holds the contigs and assembly for a single assembly only. That's why you get N's back when you ask the GRCh38 database for GRCh37 sequence.
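
As a rough sketch of what that could look like in a script: pick the server port from the assembly name, then fetch the slice from that server without passing an assembly version. The helper names port_for_assembly and fetch_sequence are made up for illustration. Port 3337 is the one mentioned above; port 3306 for GRCh38 is assumed here as the standard ensembldb.ensembl.org port. NCBI36 is deliberately left out of the table, so the sketch dies for it rather than guessing a server.

```perl
use strict;
use warnings;

# Assembly name => MySQL port on ensembldb.ensembl.org.
# 3337 (GRCh37) comes from this thread; 3306 (GRCh38) is an assumption:
# the server's standard port, not something stated in the thread.
my %PORT_FOR_ASSEMBLY = (
    GRCh38 => 3306,
    GRCh37 => 3337,
);

# Hypothetical helper: return the port for an assembly, or die if this
# sketch has no entry for it.
sub port_for_assembly {
    my ($assembly) = @_;
    my $port = $PORT_FOR_ASSEMBLY{$assembly};
    die "No server listed in this sketch for assembly '$assembly'\n"
        unless defined $port;
    return $port;
}

# Hypothetical helper: connect to the server that hosts the requested
# assembly and return the DNA sequence of the region.
sub fetch_sequence {
    my ($assembly, $chrom, $from, $to, $strand) = @_;

    # Loaded lazily so port_for_assembly() works without the Ensembl API.
    require Bio::EnsEMBL::Registry;

    # Drop any connections left over from a previous call.
    Bio::EnsEMBL::Registry->clear();
    Bio::EnsEMBL::Registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
        -port => port_for_assembly($assembly),
    );

    my $slice_adaptor =
        Bio::EnsEMBL::Registry->get_adaptor('Human', 'Core', 'Slice');

    # No assembly version argument: each server holds a single assembly.
    my $slice = $slice_adaptor->fetch_by_region(
        'chromosome', $chrom, $from, $to, $strand );
    return $slice->seq();
}
```

Calling fetch_sequence('GRCh37', '5', 112043202, 112046226, 1) should then return real bases rather than N's, provided the installed API version has a matching database on that port. Keep in mind that the coordinates are interpreted in the assembly of the server you connect to, so the same numbers on GRCh37 and GRCh38 are not the same region.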
Hope this helps,
Andy
------------
Andrew Yates - Ensembl Support Coordinator
European Molecular Biology Laboratory
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge
CB10 1SD, United Kingdom
Tel: +44-(0)1223-492538
Fax: +44-(0)1223-494468
Skype: andrewyatz
http://www.ensembl.org/
On 6 Oct 2014, at 17:23, Duarte Molha <duartemolha at gmail.com> wrote:
> Dear developers
>
> I have downloaded the latest API and I want to create a script that can retrieve sequence information from a specified assembly.
>
> So I have written a script that tries to accomplish this.
>
> Assuming these coordinates:
>
> chr: 5
> from: 112043202
> to: 112046226
> strand = 1
> assembly = GRCh38
>
> my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $chrom, $from, $to, $strand, $assembly );
>
> my $seq = $slice->seq();
>
> I can retrieve the DNA sequence:
>
> >CHR5-112043202-112046226 chr5:112043202-112046226
> AGTATATAATCACAT..............CTAAAAGCAAACA
>
>
> However, if I give it the variables:
>
> chr: 5
> from: 112043202
> to: 112046226
> strand = 1
> assembly = GRCh37 or NCBI36
>
> I get:
>
> >CHR5-112043202-112046226 chr5:112043202-112046226
> NNNNNNNNN.........NNNNNNNNNNNNNNNN
>
>
> How can I get the correct underlying sequence?
>
> Best regards
>
> Duarte