[ensembl-dev] homo_sapiens_variation_62_37g

Graham Ritchie grsr at ebi.ac.uk
Fri May 13 10:13:13 BST 2011


Hi Stuart,

The transcript_variation table was completely redesigned for this release. Each row now represents a single allele of a TranscriptVariation, so we took the opportunity to rename misleadingly named columns such as consequence_type, which can contain more than one type. The change to feature_stable_id stemmed from a development version in which we stored consequences on features other than transcripts in this table, but you are correct that the pep_allele_string column could have retained its previous name.

We made a real effort to hide these changes from API users by maintaining the TranscriptVariation interface despite significant changes under the hood. If you are dealing with the database directly, though, the redesign means you would have had to check anything that accesses this table programmatically anyway, even if we had kept the column names more consistent. We took a more conservative approach to other tables (e.g. variation_feature) and did not change their column names. I am happy to give you more detail about the changes to this part of the schema if that would help.
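
For anyone reading the table directly, the three renames discussed in this thread can be captured in a small lookup. The following is only a rough sketch: the helper name translate_columns is made up for illustration, the mapping covers only the columns named in this thread (the redesigned table has other changes), and it should only be applied to column lists for transcript_variation, since variation_feature keeps consequence_type.

```python
# Old -> new column names in transcript_variation, limited to the renames
# mentioned in this thread. Assumption: this is not the complete list of
# schema changes, so check the release 62 schema before relying on it.
RENAMED_COLUMNS = {
    "consequence_type": "consequence_types",
    "peptide_allele_string": "pep_allele_string",
    "transcript_stable_id": "feature_stable_id",
}


def translate_columns(old_columns):
    """Hypothetical helper: map pre-62 column names to their 62 names.

    Names without a known rename are returned unchanged.
    """
    return [RENAMED_COLUMNS.get(name, name) for name in old_columns]


if __name__ == "__main__":
    # "some_other_column" is a placeholder, not a real schema column.
    legacy = ["transcript_stable_id", "consequence_type", "some_other_column"]
    print(translate_columns(legacy))
```

Renaming columns is only part of the job: because each row now represents a single allele, code that assumed the old row semantics still needs to be reviewed by hand.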

Sorry for any hassle this has caused.

Graham


On 13 May 2011, at 09:49, Stuart Meacham wrote:

> Hi guys,
> 
> I have a few queries. I am trying to incorporate the latest Ensembl release into RoboSNP. Basically I use 5 tables from homo_sapiens_variation_62_37g (allele, sample, subsnp_handle, transcript_variation and variation_feature). There seem to have been a number of seemingly arbitrary changes to the schemas.
> 
> For example, in the transcript_variation table the consequence_type field has been renamed consequence_types, yet this has not been followed through in the variation_feature table, which still has a field called consequence_type. Also in transcript_variation we find peptide_allele_string is now called pep_allele_string. The transcript_stable_id is now rather confusingly called feature_stable_id. What is the rationale for these changes? I understand you need to update things as more, and more refined, data becomes available, but shouldn't the default policy be 'conservation' so as to break as few things as possible?
> 
> Cheers
> 
> Stuart
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/




