[ensembl-dev] Duplicates in allele table
Stuart Meacham
sm766 at cam.ac.uk
Thu May 12 15:25:57 BST 2011
Hi,
I wondered why there are duplicates in the allele table of the latest
human variation database (homo_sapiens_variation_62_37g)?
E.g.
select allele_id, subsnp_id, allele, frequency, count from allele where
variation_id = 25007232 and sample_id = 908;
+-----------+-----------+--------+-----------+-------+
| allele_id | subsnp_id | allele | frequency | count |
+-----------+-----------+--------+-----------+-------+
|  94403722 |  44080545 | A      |  0.808333 |    97 |
| 476949391 |  44080545 | A      |  0.808333 |    97 |
|  94403723 |  44080545 | T      |  0.191667 |    23 |
| 476949392 |  44080545 | T      |  0.191667 |    23 |
+-----------+-----------+--------+-----------+-------+
4 rows in set (0.00 sec)
As you can see, the only difference between the paired entries is the
arbitrary allele_id. Is it 'safe' to delete the duplicate rows where the
only difference appears to be the allele_id?
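To make the question concrete, here is a minimal sketch of how such duplicate groups could be listed and collapsed. It is only an illustration under stated assumptions: it uses an in-memory SQLite table as a stand-in for the real MySQL allele table, it includes only the columns that appear in the query above (the real table may have more), and the names build_db, FIND_DUPLICATES and DELETE_DUPLICATES are made up for this example. It says nothing about whether other tables reference the removed allele_id values, which is the open part of the question.

```python
import sqlite3

# The four rows from the query output above, as
# (allele_id, variation_id, sample_id, subsnp_id, allele, frequency, count).
ROWS = [
    (94403722, 25007232, 908, 44080545, "A", 0.808333, 97),
    (476949391, 25007232, 908, 44080545, "A", 0.808333, 97),
    (94403723, 25007232, 908, 44080545, "T", 0.191667, 23),
    (476949392, 25007232, 908, 44080545, "T", 0.191667, 23),
]


def build_db():
    """Create a scratch in-memory table (assumed column subset, not the real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE allele ("
        "allele_id INTEGER PRIMARY KEY, variation_id INTEGER, sample_id INTEGER, "
        'subsnp_id INTEGER, allele TEXT, frequency REAL, "count" INTEGER)'
    )
    conn.executemany("INSERT INTO allele VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


# List every group of rows that agree on all columns except allele_id.
FIND_DUPLICATES = """
SELECT variation_id, sample_id, subsnp_id, allele, frequency, "count",
       MIN(allele_id) AS keep_id, COUNT(*) AS copies
FROM allele
GROUP BY variation_id, sample_id, subsnp_id, allele, frequency, "count"
HAVING COUNT(*) > 1
ORDER BY keep_id
"""

# Keep the lowest allele_id of each group and delete the rest.
# This form works in SQLite; MySQL rejects a subquery on the table being
# deleted from, so it would need rewriting there (e.g. via a derived table).
DELETE_DUPLICATES = """
DELETE FROM allele
WHERE allele_id NOT IN (
    SELECT MIN(allele_id) FROM allele
    GROUP BY variation_id, sample_id, subsnp_id, allele, frequency, "count"
)
"""

if __name__ == "__main__":
    conn = build_db()
    for row in conn.execute(FIND_DUPLICATES):
        print(row)
    conn.execute(DELETE_DUPLICATES)
    print(conn.execute("SELECT COUNT(*) FROM allele").fetchone()[0], "rows left")
```

Running something like this against a scratch copy rather than the live database would leave the original rows available if the deletion turns out not to be safe.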
Cheers
Stuart