[ensembl-dev] database versions and genome builds
Patrick Meidl
pmeidl at cemm.oeaw.ac.at
Fri Nov 19 09:50:02 GMT 2010
On Thu, Nov 18 2010, Andrea Edwards <edwardsa at cs.man.ac.uk> wrote:
> Is there some information in a gene record that says something like
>
> 'the position of this gene is bases 100,000 - 200,000 on chromosome 7
> but in the last database build it was actually at bases 75,000 -
> 175,000 because it was mapped to a different genome build'
if you need this sort of information: there is a set of scripts which
generate a mapping between two assemblies and store it in an Ensembl
database. you can then use the core API to project features from one
assembly to another.
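
to illustrate the idea only: the sketch below is not the Ensembl API or
the scripts themselves, just a minimal Python model of what "projecting"
through an assembly mapping means. the Block type, the project()
function and the single 25,000-base offset are all hypothetical, chosen
so that the numbers match the example in the question. a real mapping
consists of many aligned blocks, and a feature can come back split into
several pieces or not map at all.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """One ungapped stretch shared by both assemblies (1-based, inclusive)."""
    new_start: int
    new_end: int
    old_start: int


def project(start: int, end: int, blocks: List[Block]) -> List[Tuple[int, int]]:
    """Project [start, end] on the new assembly onto the old assembly.

    Returns one (old_start, old_end) segment per overlapped block. Parts of
    the feature that fall outside every block are dropped.
    """
    segments = []
    for b in sorted(blocks):
        # Clip the feature to the part covered by this block.
        lo = max(start, b.new_start)
        hi = min(end, b.new_end)
        if lo <= hi:
            # Within a block, old and new coordinates differ by a fixed offset.
            offset = b.old_start - b.new_start
            segments.append((lo + offset, hi + offset))
    return segments


# Hypothetical mapping: 25,000 bases were added upstream in the new build.
blocks = [Block(new_start=25_001, new_end=1_000_000, old_start=1)]
print(project(100_000, 200_000, blocks))  # prints [(75000, 175000)]
```

returning a list of segments rather than a single range is deliberate:
it leaves room for features that span a gap between blocks.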
look in ensembl/misc-scripts/assembly in the core API CVS checkout. the
README describes how to generate the mapping, whereas
EXAMPLE.use_mapping.pl describes how to project features between
assemblies once you have such a mapping.
the Ensembl core team used to generate these mappings for at least human
and mouse when a new assembly was released, so for these species you
could use things like $gene->project('<old_assembly_name>')
out-of-the-box. I don't know if this is still the case.
also note that I don't know if the scripts mentioned still work (I wrote
them several years ago and don't know if they are still maintained).
as an alternative, UCSC also has a program (called "liftover" or similar
IIRC) which performs such a projection of coordinates across assemblies.
HTH
patrick
--
Patrick Meidl, Mag.
Bioinformatician
Ce-M-M-
Research Centre for Molecular Medicine
of the Austrian Academy of Science
Lazarettgasse 14 / AKH BT 25.3
Vienna, Austria
room 02.205
phone +43 1 40160 70016
email pmeidl at cemm.oeaw.ac.at
web http://www.cemm.at/