[ensembl-dev] Liftover script from NCBI36 to GRCh37
mag
mr6 at ebi.ac.uk
Thu Sep 18 15:03:43 BST 2014
Hi Chris,
In our latest Ensembl release, the main human assembly is GRCh38.
This means that we provide mappings between GRCh38 and most previous
assemblies (GRCh37, NCBI36, NCBI35, NCBI34).
However, we do not provide mappings between any two of the older assemblies.
If you want mappings between GRCh37 and non-GRCh38 assemblies, we
recommend using the GRCh37 databases.
These are available in release 75, so using version 75 of the API would
point you to the right databases.
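
As a rough sketch of the above (untested against the live server, so please treat it as a starting point): the snippet below reuses the calls from your script, but asks the registry for the release 75 databases and stops with a readable message when no mapper is returned. The -db_version argument is added here as an assumption about how you might pin the release if the installed API is not itself version 75; mixing API and database versions may not always work, so an API 75 checkout remains the safer route. The names parse_region and main are illustrative only, and the Registry module is loaded inside main purely so the parsing helper can be used without the Ensembl API installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a region such as "chr1:10000-20000" into (name, start, end).
# Returns an empty list if the string is malformed or start > end.
sub parse_region {
    my ($region) = @_;
    return unless defined $region
        && $region =~ /^(?:chr)?([^:]+):(\d+)-(\d+)$/;
    my ($name, $start, $end) = ($1, $2, $3);
    return if $start > $end;
    return ($name, $start, $end);
}

sub main {
    my ($from, $to, $region) = @_;
    my ($chr, $start, $end) = parse_region($region)
        or die "Cannot parse region, expected chrN:START-END\n";

    # Loaded here rather than at the top so parse_region works standalone.
    require Bio::EnsEMBL::Registry;
    my $reg = 'Bio::EnsEMBL::Registry';
    $reg->load_registry_from_db(
        -host       => 'ensembldb.ensembl.org',
        -user       => 'anonymous',
        -db_version => 75,    # assumption: pin the GRCh37-based release
    );

    my $asma = $reg->get_adaptor('human', 'core', 'AssemblyMapper');
    my $csa  = $reg->get_adaptor('human', 'core', 'CoordSystem');
    my $sa   = $reg->get_adaptor('human', 'core', 'Slice');

    my $from_cs = $csa->fetch_by_name('chromosome', $from)
        or die "Unknown coord system: $from\n";
    my $to_cs = $csa->fetch_by_name('chromosome', $to)
        or die "Unknown coord system: $to\n";

    # Guard against the undefined mapper behind the 'Can't call method
    # "map"' error: no mapping is stored for this pair of assemblies.
    my $mapper = $asma->fetch_by_CoordSystems($from_cs, $to_cs)
        or die "No mapping between $from and $to in this database\n";

    for my $res ($mapper->map($chr, $start, $end, 1, $from_cs)) {
        # Only mapped pieces are Coordinate objects; skip anything else.
        next unless $res->isa('Bio::EnsEMBL::Mapper::Coordinate');
        my $slice = $sa->fetch_by_seq_region_id($res->id);
        print join("\t", $slice->seq_region_name, $res->start, $res->end), "\n";
    }
}

# Run only when all three arguments are given; otherwise print usage.
if (@ARGV == 3) { main(@ARGV) }
else            { warn "Usage: $0 FROM_ASSEMBLY TO_ASSEMBLY chrN:START-END\n" }
```

Saved under a hypothetical name such as lift75.pl, it would be called the same way as your script, for example: ./lift75.pl NCBI36 GRCh37 chr1:10000-20000
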
Hope that helps,
Magali
On 17/09/2014 16:50, Chris Penkett wrote:
>
> Hi Ensembl developers,
>
> I've been using a script (see below) with the Ensembl API to do
> liftover of coordinates between different genome versions, but it no
> longer seems to work for moving NCBI36 to GRCh37. Is it still
> possible to do this conversion with the new genome reference?
>
> Output of script for different genome assembly liftovers:
>
> % ./ens_db_lift.pl NCBI36 GRCh37 chr1:10000-20000
> Can't call method "map" on an undefined value at ./ens_db_lift.pl line 42.
>
> % ./ens_db_lift.pl NCBI36 GRCh38 chr1:10000-20000
> 1, 10000, 20000 --> 1 20137 30137
>
> % ./ens_db_lift.pl GRCh37 GRCh38 chr1:10000-20000
> 1, 10000, 20000 --> 1 10001 20000
>
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use Data::Dumper;
>
> use lib "/src/ensembl/bioperl-1.2.3";
> use lib "/home/cjp64/src/ensembl/ensembl/modules";
>
> use Bio::EnsEMBL::DBSQL::DBAdaptor;
> use Bio::EnsEMBL::AssemblyMapper;
> use Bio::EnsEMBL::Mapper::Coordinate;
> use Bio::EnsEMBL::DBSQL::SliceAdaptor;
>
> my $from = shift;
> my $to = shift;
> my $coord = shift;
>
> my $species = "human";
> my $host = 'ensembldb.ensembl.org';
> my $user = 'anonymous';
>
> my $reg = 'Bio::EnsEMBL::Registry';
> $reg->load_registry_from_db(-host => $host, -user => $user);
>
> my $asma = $reg->get_adaptor($species, 'core', 'AssemblyMapper');
> my $csa = $reg->get_adaptor($species, 'core', 'CoordSystem');
> my $sa = $reg->get_adaptor($species, 'core', 'Slice');
>
> my $from_cs = $csa->fetch_by_name('chromosome', $from );
> die "Unknown coord system: $from\n" if not $from_cs;
> my $to_cs = $csa->fetch_by_name('chromosome', $to);
> die "Unknown coord system: $to\n" if not $to_cs;
>
> my $mapper = $asma->fetch_by_CoordSystems($from_cs, $to_cs);
>
> $coord =~ /(.*?):(\d+)-(\d+)/;
>
> my ($chr, $start, $end) = ($1, $2, $3);
> $chr =~ s/chr//;
>
> my @res = $mapper->map($chr, $start, $end, 1, $from_cs);
> foreach my $res ( @res ) {
>     if ($res->isa( 'Bio::EnsEMBL::Mapper::Coordinate') ) {
>         my $chr_slice = $sa->fetch_by_seq_region_id($res->id);
>         print "$chr, $start, $end --> " . join("\t",
>             $chr_slice->seq_region_name, $res->start, $res->end) . "\n";
>     }
> }
>
> Best wishes,
> Chris
>