[ensembl-dev] get sequence from different build

Hiram Clawson hiram at soe.ucsc.edu
Fri Sep 24 15:17:27 BST 2010


The first 10,000 bases of GRCh37 are all "N" (telomere gap).
In build 36 that region is actual sequence, which, when translated
to GRCh37, starts at position 10,001.
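
As a rough illustration of the point above, here is a minimal sketch that
checks whether a region falls entirely inside that leading gap. It assumes
the gap is on chr1 (the chromosome in the example region chr1:1230-1240)
and uses 1-based inclusive coordinates; in_leading_gap is a hypothetical
helper name, not part of the Ensembl API.

```perl
use strict;
use warnings;

# Last base of the leading "N" gap on GRCh37, per the figure above.
# Assumption: this applies to chr1, the chromosome in the example region.
my $GAP_END = 10_000;

# Hypothetical helper (not an Ensembl API call): true if the 1-based,
# inclusive region [$start, $end] lies entirely inside the leading gap.
sub in_leading_gap {
    my ( $start, $end ) = @_;
    return ( $start >= 1 && $end <= $GAP_END ) ? 1 : 0;
}

# The example region from the question, and one just past the gap.
for my $region ( [ 1230, 1240 ], [ 10_001, 10_011 ] ) {
    my ( $start, $end ) = @$region;
    printf "chr1:%d-%d %s\n", $start, $end,
        in_leading_gap( $start, $end )
        ? 'is all N on GRCh37'
        : 'is not entirely inside the leading gap';
}
```

A region such as chr1:1230-1240 lies inside the gap, so GRCh37 would be
expected to return only "N" for it.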

--Hiram

On Thu, Sep 23, 2010 at 9:58 AM,  <mailsvl at fastmail.fm> wrote:
> Hi Javier,
>
> What I want is your first example, to get for a region (eg
> chr1:1230-1240):
> 1) the seq of build 36
> 2) the seq of build 37
>
> But the code below always gives me 'NNNNNN' for build 36. Try this:
>
> # ======================================================
> use strict;
> use warnings;
> use Bio::EnsEMBL::Registry;
>
> my $s = 'Human';      # species name
> my $r = 'chromosome'; # coordinate system name
> my $c = 11;           # chromosome
> my $p = 123000000;    # start position
>
> # ======================================================
> my $registry = 'Bio::EnsEMBL::Registry';
> $registry->load_registry_from_db(
>  -host => 'ensembldb.ensembl.org',
>  -user => 'anonymous'
> );
> my $sa = $registry->get_adaptor( $s, 'Core', 'Slice' );
>
> # ======================================================
> my $s36 = $sa->fetch_by_region( $r, $c, $p, $p+20, 1, 'NCBI36' );
> my $s37 = $sa->fetch_by_region( $r, $c, $p, $p+20, 1, 'GRCh37' );
> print $s36->seq."\n";
> print $s37->seq."\n";
>
> Results in:
>> NNNNNNNNNNNNNNNNNNNNN
>> TGCACTCCAGCCTGGGCAATG
>
> I tried Ensembl versions 59, 58 and 57; all failed...
>
> -Stef



