[ensembl-dev] Duplicates in allele table
pmeidl at cemm.oeaw.ac.at
Thu May 12 15:43:48 BST 2011
On Thu, May 12 2011, Will McLaren <wm2 at ebi.ac.uk> wrote:
> It is safe to delete them, yes - if you know of a clever way of doing
> this then please share (I've just spent 2 days dumping, splitting,
> unique sorting and reimporting because our server runs out of tmp
> space if I try and do a GROUP BY statement on a table this large)!
In MySQL, you can do this:
ALTER IGNORE TABLE allele
ADD UNIQUE INDEX (subsnp_id, allele, frequency, count);
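This will only work if none of these columns contains NULLs: a unique
index treats NULL values as distinct from each other, so duplicate rows
that contain a NULL would not be removed. Also, I haven't tested this on
a huge table, so I can't comment on performance.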
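
To make the NULL caveat concrete, here is a small sketch of the same idea (a unique index plus "ignore duplicates"). It is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, copies into a second table with INSERT OR IGNORE instead of running ALTER IGNORE TABLE, and the table name allele_dedup, the index name uniq_allele, the column types and the sample rows are all made up for the demo. SQLite also treats NULLs in a unique index as distinct, so it shows the same effect, but it says nothing about how MySQL performs on a large table.

```python
import sqlite3


def dedupe_demo():
    """Copy rows into a uniquely indexed table, skipping duplicate rows."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Toy version of the allele table; the column types are assumptions.
    cur.execute(
        'CREATE TABLE allele '
        '(subsnp_id INTEGER, allele TEXT, frequency REAL, "count" INTEGER)'
    )
    rows = [
        (1, "A", 0.5, 10),
        (1, "A", 0.5, 10),      # exact duplicate without NULLs
        (2, "C", None, None),
        (2, "C", None, None),   # duplicate that contains NULLs
    ]
    cur.executemany("INSERT INTO allele VALUES (?, ?, ?, ?)", rows)

    # Hypothetical target table carrying the unique index over all four columns.
    cur.execute(
        'CREATE TABLE allele_dedup '
        '(subsnp_id INTEGER, allele TEXT, frequency REAL, "count" INTEGER)'
    )
    cur.execute(
        'CREATE UNIQUE INDEX uniq_allele '
        'ON allele_dedup (subsnp_id, allele, frequency, "count")'
    )

    # Rows that would violate the unique index are silently skipped.
    cur.execute("INSERT OR IGNORE INTO allele_dedup SELECT * FROM allele")

    result = cur.execute(
        'SELECT subsnp_id, allele, frequency, "count" '
        'FROM allele_dedup ORDER BY rowid'
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in dedupe_demo():
        print(row)
```

In this sketch the duplicate of the fully populated row is dropped, while both rows with NULL in frequency and count are kept, which is exactly the case the caveat above warns about.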
Patrick Meidl, Mag.
Research Centre for Molecular Medicine
of the Austrian Academy of Science
Lazarettgasse 14 / AKH BT 25.3
phone +43 1 40160 70016
email pmeidl at cemm.oeaw.ac.at