[ensembl-dev] changing coordinate from assembly versions

mag mr6 at ebi.ac.uk
Thu Nov 14 14:34:11 GMT 2013


Hi Nathalie,

To do this, you will probably need to use a recent Ensembl database that 
already contains the latest assembly (GRCm38).

In the coord_system table, you can see all the coordinate systems 
available within the database:
+-----------------+------------+-------------+---------+------+--------------------------------+
| coord_system_id | species_id | name        | version | rank | attrib                         |
+-----------------+------------+-------------+---------+------+--------------------------------+
|               1 |          1 | contig      | NULL    |    3 | default_version,sequence_level |
|               2 |          1 | scaffold    | GRCm38  |    2 | default_version                |
|               3 |          1 | chromosome  | GRCm38  |    1 | default_version                |
|              11 |          1 | chromosome  | NCBIM37 |    4 |                                |
|              12 |          1 | supercontig | NCBIM37 |    5 |                                |
|             111 |          1 | chromosome  | NCBIM36 |    6 |                                |
+-----------------+------------+-------------+---------+------+--------------------------------+
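
The same information can be read through the API, which is handy as a quick 
check before transforming anything. A minimal sketch, assuming you already 
have a CoordSystemAdaptor for your database; has_coord_system is a 
hypothetical helper name, not part of the Ensembl API:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): return 1 if the
# database behind $cs_adaptor holds a coordinate system with the given
# name and version, 0 otherwise.
sub has_coord_system {
    my ($cs_adaptor, $name, $version) = @_;

    # fetch_all() returns a reference to a list of CoordSystem objects.
    for my $cs (@{ $cs_adaptor->fetch_all }) {
        next unless $cs->name eq $name;

        # Some coordinate systems (e.g. contig) have no version at all.
        my $cs_version = $cs->version;
        return 1 if defined $cs_version && $cs_version eq $version;
    }
    return 0;
}
```

With a registry loaded, something like 
$registry->get_adaptor('mouse', 'core', 'coordsystem') should give you the 
adaptor to pass in. has_coord_system($csa, 'chromosome', 'NCBIM37') then 
tells you whether the old assembly is present in that database.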

The transform() method should then allow you to convert features between 
coordinate systems.

For example, given a gene $gene in old NCBIM37 (mm37) coordinates,
my $new_gene = $gene->transform('chromosome', 'GRCm38');
should return a new gene object in GRCm38 coordinates, provided there is 
a mapping available between the two sets of coordinates.
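
When no mapping exists, transform() returns undef rather than dying, so it 
is worth checking the result before using it. A small sketch, where 
transform_or_warn is a hypothetical wrapper name and $gene is assumed to 
come from a database that holds both assemblies:

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of the Ensembl API): transform a feature
# to the requested coordinate system and warn when it cannot be mapped.
sub transform_or_warn {
    my ($feature, $cs_name, $cs_version) = @_;

    # transform() gives back a new feature, or undef if the feature
    # cannot be placed on the target coordinate system.
    my $moved = $feature->transform($cs_name, $cs_version);
    if (!defined $moved) {
        warn "Could not transform feature to $cs_name $cs_version\n";
    }
    return $moved;
}

# Usage sketch, assuming $gene is in NCBIM37 coordinates:
#   my $new_gene = transform_or_warn($gene, 'chromosome', 'GRCm38');
#   printf "%s:%d-%d\n", $new_gene->seq_region_name,
#       $new_gene->start, $new_gene->end if defined $new_gene;
```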

Two things to bear in mind:
- The gene to transform needs to be in a database that contains both 
assemblies and a mapping between them.
This means you will probably need to copy your old set of genes into a 
more recent database.

- There is not always a one-to-one mapping available between different 
regions of the genome.
If your gene spans a region that has changed massively between the two 
assemblies and only fragments of it could be mapped, the gene will not 
transform.
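
For plain chromosome:start:end coordinates, or for regions that only map in 
fragments, an alternative to transform() is to project a slice, which 
returns every piece that maps instead of failing. This is a sketch of that 
different approach, not the transform() route described above. 
project_region is a hypothetical helper name, and $slice_adaptor is assumed 
to be a SliceAdaptor on a database that holds both assemblies and the 
mapping between them:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): project one
# "chromosome:start:end" string from an old assembly onto a new one.
# Returns one hash reference per mapped fragment, or an empty list.
sub project_region {
    my ($slice_adaptor, $region, $old_version, $new_version) = @_;

    my ($chr, $start, $end) = split /:/, $region;
    die "Expected chromosome:start:end, got '$region'\n"
        unless defined $end && $start =~ /^\d+$/ && $end =~ /^\d+$/;

    # Ask for the region explicitly on the old assembly version.
    my $old_slice = $slice_adaptor->fetch_by_region(
        'chromosome', $chr, $start, $end, 1, $old_version);
    return () unless defined $old_slice;

    my @fragments;
    for my $segment (@{ $old_slice->project('chromosome', $new_version) }) {
        my $new_slice = $segment->to_Slice;
        push @fragments, {
            # from_start/from_end are relative to the start of $old_slice,
            # so shift them back to chromosome coordinates.
            old_start  => $start + $segment->from_start - 1,
            old_end    => $start + $segment->from_end - 1,
            new_chr    => $new_slice->seq_region_name,
            new_start  => $new_slice->start,
            new_end    => $new_slice->end,
            new_strand => $new_slice->strand,
        };
    }
    return @fragments;
}

# Usage sketch:
#   my @parts = project_region($slice_adaptor, '1:3000000:3100000',
#                              'NCBIM37', 'GRCm38');
```

A region that maps cleanly should come back as a single fragment; several 
fragments, or none, tell you the region changed between the assemblies.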


Hope that helps,
mag


On 14/11/2013 13:24, Nathalie Conte wrote:
> HI,
> I have a list of coordinates (chromosome:start:end) from a previous assembly (mm37) and want to translate this into the current assembly. What is the best method to use in the perl API please?
> thanks
> Nathalie




