[ensembl-dev] Coordinate System anomalies in EnsemblGenomes

PATERSON Trevor trevor.paterson at roslin.ed.ac.uk
Fri Mar 1 16:52:49 GMT 2013


I am a little concerned that something may have gone awry with the pipeline for the bacterial genome assemblies in release 17.

(but not concerned enough to want an instant answer on a Friday afternoon ;)

previously in the 'coord_system' table there always seemed to be a CS with rank = 1 for each species
(this is the 'top_level'; the documentation at http://www.ensembl.org/info/docs/api/core/core_schema.html#coord_system
still says that the 'top_level' can also be identified by the attribute value "top_level" - but as far as I am aware this is never the case)

there are now species in the bacterial databases that don't have a CS with rank = 1
(e.g. see 'Bacillus thuringiensis IBL200', which is species_id = '37' in the database 'bacteria_21_collection_core_17_70_1';
this has two CS: rank 3 = supercontig, rank 4 = contig (sequence_level),
so in this case rank 3 has to be treated as 'top_level' by the API)
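
To make the check above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: an in-memory SQLite table stands in for the real MySQL 'coord_system' table, only four of its columns are modelled, the rows for species_id 37 are the ones quoted above, and the rows for species_id 1 are an invented control. The function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for 'coord_system' (assumed subset of columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE coord_system ("
        'species_id INTEGER, name TEXT, "rank" INTEGER, attrib TEXT)'
    )
    conn.executemany(
        "INSERT INTO coord_system VALUES (?, ?, ?, ?)",
        [
            # Invented control species that does have a rank-1 CS.
            (1, "chromosome", 1, ""),
            (1, "contig", 2, "sequence_level"),
            # Rows as described for species_id 37 in release 17.
            (37, "supercontig", 3, ""),
            (37, "contig", 4, "sequence_level"),
        ],
    )
    return conn


def species_without_rank1(conn):
    """Return (species_id, name, rank) of the lowest-ranked CS for every
    species whose lowest rank is greater than 1."""
    query = """
        SELECT cs.species_id, cs.name, cs."rank"
        FROM coord_system AS cs
        JOIN (
            SELECT species_id, MIN("rank") AS top_rank
            FROM coord_system
            GROUP BY species_id
        ) AS t
          ON t.species_id = cs.species_id AND t.top_rank = cs."rank"
        WHERE t.top_rank > 1
        ORDER BY cs.species_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(species_without_rank1(build_demo_db()))  # [(37, 'supercontig', 3)]
```

The same SELECT should carry over to the real database with little change, although 'rank' may need quoting differently depending on the MySQL version.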

However, compare releases 16 and 17:

For the species 'Bacillus thuringiensis IBL200' (with the same assembly accession and name in releases 16 and 17, but a different assembly date)
in release 16 there IS a chromosome level CS, with rank 1, in the database 'bacillus_collection_core_16_69_4' (species_id = '27'):
rank 1 = chromosome, rank 3 = contig (sequence_level)

the current website for this species shows the supercontig fragments
http://bacteria.ensembl.org/bacillus_thuringiensis_ibl_200/Location/Genome
and I am guessing that the previous release would have shown the chromosome assembly available through the release 16 database

so I am concerned that the pipeline has failed to generate or to copy over the top level, rank 1, chromosome_level assembly in release 17 for at least this one species
(otherwise the quality of the assembly must have been 'demoted' from chromosomal to supercontig)

there are several other species lacking a top level/chromosome_level/rank 1 CS in release 17,
but obviously this may represent the real state of these assemblies;
without a detailed comparison with release 16 I can't say whether the situation is similar to the above
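
The comparison with release 16 mentioned above could be sketched roughly as below. This is a hedged illustration only: it assumes the rank/name pairs have already been pulled out of each release into plain dictionaries, and it keys them by species name because the species_id differs between releases (27 vs 37 for the example species). The function name and the second species are invented for illustration.

```python
def lost_rank1(old_release, new_release):
    """Report species that had a rank-1 CS in old_release but not in new_release.

    Both arguments map species name -> {rank: coord system name}.
    Returns species name -> (old rank-1 name, new lowest-rank name, new lowest rank).
    """
    report = {}
    for species, old_cs in old_release.items():
        new_cs = new_release.get(species)
        if not new_cs:
            continue  # species absent (or empty) in the newer release
        if 1 in old_cs and 1 not in new_cs:
            top = min(new_cs)  # lowest rank number acts as the effective top level
            report[species] = (old_cs[1], new_cs[top], top)
    return report


# Values for 'Bacillus thuringiensis IBL200' are those quoted in this mail;
# 'Example species A' is an invented control that keeps its rank-1 CS.
RELEASE_16 = {
    "Bacillus thuringiensis IBL200": {1: "chromosome", 3: "contig"},
    "Example species A": {1: "chromosome", 2: "contig"},
}
RELEASE_17 = {
    "Bacillus thuringiensis IBL200": {3: "supercontig", 4: "contig"},
    "Example species A": {1: "chromosome", 2: "contig"},
}

if __name__ == "__main__":
    print(lost_rank1(RELEASE_16, RELEASE_17))
```

A species flagged this way is only a candidate for a pipeline problem; it could equally be a genuine change in the assembly.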

cheers

Trevor






Trevor Paterson PhD
trevor.paterson at roslin.ed.ac.uk
Bioinformatics
The Roslin Institute
Royal (Dick) School of Veterinary Studies
University of Edinburgh
Easter Bush
Midlothian
EH25 9RG
Scotland UK

phone +44 (0)131 651 9157

http://bioinformatics.roslin.ed.ac.uk/

