[ensembl-dev] Perl BLOBs in the ensembl compara database

PATERSON Trevor trevor.paterson at roslin.ed.ac.uk
Fri Jan 21 14:28:32 GMT 2011


I have been trying to get to grips with the Compara schema, with a view to writing Java libraries to access the data.
 
However it appears that some of the data in Compara is more intimately wedded to Perl than I had hoped!
 
Looking at the Genomic Alignment data, the Conservation Score values (which I think can be a variable-length array of floats) are stored as BLOBs: packed internal representations of Perl floats, which therefore appear to require Perl to unpack them.
 
Quickly scanning through the schema, I don't see any other fields of type BLOB.
 
My understanding is that these values are probably dumped here using 'pack' as a quick 'hack' to avoid having to deal with variable length arrays.
 
Unfortunately, it does rather tie the data to Perl.
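
If, as I suspect, the BLOBs are simply the raw output of Perl's pack with a native single-precision float template (something like 'f*'; I have not confirmed this against the API code), then each value would be a 4-byte IEEE 754 float and could in principle be decoded outside Perl. Below is a minimal Java sketch under that assumption. The class and method names are my own, and the byte order is a guess (little-endian, assuming the packing was done on x86 hardware), so it is passed in as a parameter.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper, not part of any Ensembl API.
class PackedFloatBlob {

    // Decode a BLOB ASSUMED to hold consecutive 4-byte IEEE 754 floats,
    // as Perl's pack would write them with a native float template.
    public static float[] unpack(byte[] blob, ByteOrder order) {
        if (blob.length % Float.BYTES != 0) {
            throw new IllegalArgumentException(
                "BLOB length " + blob.length + " is not a multiple of 4");
        }
        float[] scores = new float[blob.length / Float.BYTES];
        // The float view inherits the byte order set on the byte buffer.
        ByteBuffer.wrap(blob).order(order).asFloatBuffer().get(scores);
        return scores;
    }

    // Inverse operation, used here only to build test data.
    public static byte[] pack(float[] scores, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(scores.length * Float.BYTES).order(order);
        for (float s : scores) {
            buf.putFloat(s);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        // Round trip on made-up values; real data would come from the database.
        float[] original = {0.5f, -1.25f, 3.0f};
        byte[] blob = pack(original, ByteOrder.LITTLE_ENDIAN);
        System.out.println(Arrays.toString(unpack(blob, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

The byte array itself would come from a JDBC call such as ResultSet.getBytes("diff_score"). If the decoded values look like nonsense, the first thing to try would be the other byte order.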

Is this a design decision, or just a historical accident?
Are there (or will there be) any other examples of Perl BLOBs in Ensembl?

 
cheers
 
Trevor
 
 

mysql> describe conservation_score;
+------------------------+----------------------+------+-----+---------+-------+
| Field                  | Type                 | Null | Key | Default | Extra |
+------------------------+----------------------+------+-----+---------+-------+
| genomic_align_block_id | bigint(20) unsigned  | NO   | MUL | NULL    |       |
| window_size            | smallint(5) unsigned | NO   |     | NULL    |       |
| position               | int(10) unsigned     | NO   |     | NULL    |       |
| expected_score         | blob                 | YES  |     | NULL    |       |
| diff_score             | blob                 | YES  |     | NULL    |       |
+------------------------+----------------------+------+-----+---------+-------+

 

Trevor Paterson PhD
email trevor.paterson at roslin.ed.ac.uk

Bioinformatics 
The Roslin Institute
The Royal (Dick) School of Veterinary Studies
University of Edinburgh
Scotland EH25 9PS
phone +44 (0)131 5274197
http://bioinformatics.roslin.ed.ac.uk/

The University of Edinburgh is a charitable body, registered in Scotland with registration number SC005336

