[ensembl-dev] Patching an Ensembl compare database from 67 to 74

Matthieu Muffato muffato at ebi.ac.uk
Fri Jan 17 17:49:35 GMT 2014


Dear Jan,

There should be a single "exon bounded" sequence for each member_id and 
the API would anyway select only one of them.
You can remove the duplicated entries with this query:

DELETE FROM sequence_exon_bounded WHERE sequence_exon_bounded_id IN 
(SELECT sequence_exon_bounded_id FROM sequence_exon_bounded GROUP BY 
member_id HAVING COUNT(*) > 1);

You may have to run it several times if some member_ids have 3 or more 
sequences


Regarding Will's comment: Compara indeed only started distributing the 
patches as individual patches with the according meta keys a few 
releases ago.

Best,
Matthieu

On 17/01/14 10:08, Will Chow wrote:
> Hi Jan
>
> I noticed warnings as well patching from 66->73 for compara for a few
> tables ( I don't quite remember which ones ).
>
> Of course this doesn't really help you or solve your issues, but since
> our group doesn't use those tables I just ignored the patch warnings and
> for our purposes, it seems to work.   I do notice in the meta table the
> patch keys for the updates are not there below e70 even though the
> patches are in the sql directory.
>
> again doesn't really help you I guess.
>
> Will
>
> On 17 Jan 2014, at 06:49, Jan Vogel <jan.vogel at gmail.com
> <mailto:jan.vogel at gmail.com>> wrote:
>
>>
>> Hi there,
>>
>> i'm trying to patch our ensembl 67 compara database up to schema 74.
>>
>> It fails :-( when applying the
>> *ensembl-compara/sql/patch_67_68.sql* patch as in the 67 schema there
>> are duplicate member ids in the sequence_exon_bounded table. Any idea
>> how to fix this and patch my e67 up to e68 ?
>>
>> Thanks,
>>
>>    Jan Vogel
>>
>>
>> ERROR 1062 (23000) at line 28 in file:
>> '/gnet/resfiles3/resapps/apps/ensembl/ensembl-74/ensembl-compara/sql/patch_67_68.sql':
>> Duplicate entry '2012449' for key 'PRIMARY' :
>>
>> --------------
>> SET session sql_mode='TRADITIONAL'
>> --------------
>> ……
>> …..
>>
>>
>> --------------
>> ALTER TABLE sequence_exon_bounded
>> DROP PRIMARY KEY, DROP KEY member_id,
>> DROP COLUMN sequence_exon_bounded_id,
>> ADD PRIMARY KEY (member_id)
>> --------------
>>
>> Failed to apply patch 'patch_67_68.sql' to database 'ensembl_compara_67'!
>> In non-interative mode; aborting current run at schema_patcher.pl line
>> 643.
>>
>>
>> Looking into the sequence_exon_bounded table, it looks that some
>> member ids are duplicated:
>>
>> mysql> select count(distinct(member_id))  from sequence_exon_bounded
>> limit 10 ;
>> +----------------------------+
>> | count(distinct(member_id)) |
>> +----------------------------+
>> |                    1371152 |
>> +----------------------------+
>> 1 row in set (10.55 sec)
>>
>> mysql> select count(member_id)  from sequence_exon_bounded limit 10 ;
>> +------------------+
>> | count(member_id) |
>> +------------------+
>> |          1373719 |
>> +------------------+
>>
>>
>>
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>
>
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>


-- 
Matthieu Muffato, Ph.D.
Ensembl Developer and Ensembl Compara Manager
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom
Room  A3-145
Phone + 44 (0) 1223 49 4631
Fax   + 44 (0) 1223 49 4468




More information about the Dev mailing list