[ensembl-dev] get sequence from different build

Javier Herrero jherrero at ebi.ac.uk
Thu Sep 16 16:27:53 BST 2010


Hi Stef

You can use 
$slice_adaptor->fetch_by_region( 'chromosome', '11', 123000000, 123000020, 1, 
'NCBI36' );

$slice_adaptor->fetch_by_region( 'chromosome', '11', 123000000, 123000020, 1, 
'GRCh37' );


But probably what you want is to map between coordinates. Here is an example:

my $slice = $sa->fetch_by_region( "chromosome", 11, 123000000, 123000020, 1, 
"NCBI36");
print $slice->name, "\n";
my $proj_segments = $slice->project("chromosome", "GRCh37");
foreach my $proj_segment (@$proj_segments) {
  print $proj_segment->to_Slice->name, "\n";
}

And the same, mapping in the other direction.

my $slice = $sa->fetch_by_region( "chromosome", 11, 123000000, 123000020, 1, 
"GRCh37");
print $slice->name, "\n";
my $proj_segments = $slice->project("chromosome", "NCBI36");
foreach my $proj_segment (@$proj_segments) {
  print $proj_segment->to_Slice->name, "\n";
}

You can read more about this in the perdoc for the
Bio::EnsEMBL::Slice->project method.

Javier

On Thursday 16 Sep 2010 15:00:25 mailsvl at fastmail.fm wrote:
> Ok, so this means you cannot get the sequences of different builds (for
> the same chromosomal coordinates) within one script? Or is this possible
> by changing the db version temporarily (with eg reset_DBAdaptor...)?
> 
> -Stef
> 
> 
> ----- Original message -----
> From: "Jan-hinnerk Vogel" <jhv at sanger.ac.uk>
> To: mailsvl at fastmail.fm
> Cc: "Ensembl DevList" <dev at ensembl.org>
> Date: Thu, 16 Sep 2010 14:30:46 +0100
> Subject: Re: [ensembl-dev] get sequence from different build
> 
> Hi Stef,
> 
> ooops - it seems the tutorial is wrong, sorry. We'll correct it asap.
> 
> What happend is that the 'strand' argument, which should be the 5th
> argument, slipped somehow.
> The assembly version should be the 6th argument, not the 5th. A good
> way to get warned about
> those things is to  use perl -w or use 'use warnings' in your script -
> you see something like
> 
> "Argument "GRCh37" isn't numeric in integer multiplication (*) at /nfs/
> ensembl/jhv//cvs_checkout//ensembl_live/modules/Bio/EnsEMBL/Mapper.pm
> line 314." so something is wrong ...
> 
> 
> http://www.ensembl.org/info/docs/Pdoc/ensembl/modules/Bio/EnsEMBL/DBSQL/Sli
> ceAdaptor.html#POD15
> 
> 
> You currently can't fetch sequence data from a different assembly
> version from the same database  - only the mapping is stored,
> not the sequence of older assemblies. If you try it, you should get a
> string of NNNNN's back.
> 
> To fetch sequence for GRCh37 use the latest human core :
> homo_sapiens_core_59_37d
> to fetch seq for NCBI36 you have to use an older database and connect
> with an older API checkout - ie homo_sapiens_core_***_36*
> 
> 
> Hth,
>             Jan-Hinnerk Vogel
> 
> 
> 
> ==
> Ensembl Genebuild Team Project Leader
> www.ensembl.org
> 
> On 16 Sep 2010, at 12:54, mailsvl at fastmail.fm wrote:
> > Hi,
> > 
> > How can I get a slice from a different build then the default? When I
> > try to compare GCRh37 and NCBI36 I keep getting the same sequences
> > (from
> > GCRh37), no matter what region I try.
> 
> I assume you mean i "GRCh37" ?
> 
> > The code I use is from:
> > http://www.ensembl.org/info/docs/api/core/core_tutorial.html
> > 
> > Code:
> > - $slice_adaptor->fetch_by_region( 'chromosome', '11', 123000000,
> > 123000020, 'NCBI36' ); or
> > - $slice_adaptor->fetch_by_region( 'chromosome', '11', 123000000,
> > 123000020, 1, 'NCBI36' );
> > 
> > Results when using ensembl website:
> >> 11 dna:chromosome chromosome:NCBI36:11:123000000:123000020:1
> > 
> > AAGAGTCCCACCAGCTCCAGG
> > 
> >> 11 dna:chromosome chromosome:GRCh37:11:123000000:123000020:1
> > 
> > TGCACTCCAGCCTGGGCAATG
> > 
> > Thanks,
> > Stef
> > __________________
> > http://www.fastmail.fm
> > 
> > 
> > _______________________________________________
> > Dev mailing list
> > Dev at ensembl.org
> > http://lists.ensembl.org/mailman/listinfo/dev
> 
> __________________
> http://www.fastmail.fm
> 
> 
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev

-- 
Javier Herrero, PhD
Ensembl Compara Project Leader
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK




More information about the Dev mailing list