[ensembl-dev] get sequence from different build
Ian Longden
ianl at ebi.ac.uk
Fri Sep 24 15:28:25 BST 2010
Ah yes, I did my test on another region where there was sequence.
Here is the full code and an example:
-----------------------------------------------------------------------------------------------------
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $s = 'Human';      # species name
my $r = 'chromosome'; # coordinate system of the slice
my $c = 11;           # chromosome
#my $p = 123000000;   # position from the original question
my $p = 60001;        # start position used for this test
# ======================================================
# Connect to the public Ensembl database server.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous'
);
my $sa = $registry->get_adaptor( $s, 'Core', 'Slice' );
# ======================================================
# Fetch the same coordinates on both assemblies.
my $s36 = $sa->fetch_by_region( $r, $c, $p, $p+20, 1, 'NCBI36' );
my $s37 = $sa->fetch_by_region( $r, $c, $p, $p+20, 1, 'GRCh37' );
print "Seq for 37:-\n";
print $s37->seq."\n";
# Project the NCBI36 slice onto GRCh37; the projected slices are on GRCh37.
my $chr_projection = $s36->project('Chromosome','GRCh37');
my $seq = "";
foreach my $segment (@$chr_projection) {
    # Each segment unpacks to (from_start, from_end, projected slice).
    my ($start, $end, $chr) = @$segment;
    $seq .= $chr->seq;
}
print "Seq for 36:-\n";
print $seq."\n";
------------------------------------------------------------------------------------------------
Giving:-
Seq for 37:-
GAATTCTACATTAGAAAAATA
Seq for 36:-
AGGCAGAGGTCAAAGTGAGCC
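One caveat with the loop above: it simply concatenates whatever segments come back from the projection, so if part of the NCBI36 region has no mapping to GRCh37 the result will be shorter than the region and the gap will not be visible. A minimal sketch of one way to handle that, padding unmapped stretches with 'N', is below. The helper name stitch_segments and the example pieces are made up for illustration; it assumes each mapped piece has the same length on both assemblies.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): rebuild a sequence of
# $length bases from projection pieces, leaving 'N' wherever nothing mapped.
# Each piece is [ $from_start, $from_end, $bases ], with 1-based coordinates
# relative to the start of the original slice.
sub stitch_segments {
    my ( $length, @pieces ) = @_;
    my $seq = 'N' x $length;    # start with all-N, then overwrite mapped parts
    for my $piece ( sort { $a->[0] <=> $b->[0] } @pieces ) {
        my ( $from_start, $from_end, $bases ) = @$piece;
        substr( $seq, $from_start - 1, $from_end - $from_start + 1 ) = $bases;
    }
    return $seq;
}

# Made-up example: a 21-base slice where bases 9-12 have no mapping.
my $stitched = stitch_segments( 21, [ 1, 8, 'AGGCAGAG' ], [ 13, 21, 'AAGTGAGCC' ] );
print "$stitched\n";
```

With the script above, the pieces could presumably be built from the projection as [ $segment->from_start, $segment->from_end, $segment->to_Slice->seq ] and the length taken from $s36->length, but I have not run it in that form.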
Cheers,
Ian.
On Fri, Sep 24, 2010 at 3:17 PM, Hiram Clawson <hiram at soe.ucsc.edu> wrote:
> The first 10,000 bases of GRCh37 are all "N" - telomere gap.
> It is actual sequence in build 36, when translated to GRCh37
> starts at position 10,001
>
> --Hiram
>
> On Thu, Sep 23, 2010 at 9:58 AM, <mailsvl at fastmail.fm> wrote:
>> Hi Javier,
>>
>> What I want is your first example, to get for a region (eg
>> chr1:1230-1240):
>> 1) the seq of build 36
>> 2) the seq of build 37
>>
>> But the code below always gives me 'NNNNNN' for the 36 build, try this:
>>
>> # ======================================================
>> my $s = 'Human'; # species-name
>> my $r = 'chromosome'; # slice region
>> my $c = 11; # chromosome
>> my $p = 123000000; # position
>>
>> # ======================================================
>> my $registry = 'Bio::EnsEMBL::Registry';
>> $registry->load_registry_from_db(
>> -host => 'ensembldb.ensembl.org',
>> -user => 'anonymous'
>> );
>> my $sa = $registry->get_adaptor( $s, 'Core', 'Slice' );
>>
>> # ======================================================
>> my $s36 = $sa->fetch_by_region( $r, $c, $p, $p+20, 1, 'NCBI36' );
>> my $s37 = $sa->fetch_by_region( $r, $c, $p, $p+20, 1, 'GRCh37' );
>> print $s36->seq."\n";
>> print $s37->seq."\n";
>>
>>
>> Results in:
>>> NNNNNNNNNNNNNNNNNNNNN
>>> TGCACTCCAGCCTGGGCAATG
>>
>> Using Ensembl version 59, 58 or 57, all failed...
>>
>> -Stef
>