[ensembl-dev] variation schema

Will McLaren wm2 at ebi.ac.uk
Wed Dec 22 09:44:05 GMT 2010


Hi Andrea,

Apologies, the schema document has not yet been updated to include the
variation set tables, but your understanding of them is correct. Sets are a
generic, catch-all way of grouping variations: they allow us to group,
for example, all variants from the HapMap project, all variants with
phenotypic associations, or, as in this case, all variants called in a
particular individual.

Alleles are linked to populations rather than to sets, and there is a
population representing Watson (it is named "ENSEMBL:ENSEMBL_Watson", is of
size one, and has an individual named "Watson"). Thus if a variation belongs
to the Watson set, it should have a pair of alleles linked to the Watson
population.
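
To make that concrete, here is a rough sketch of the join using a toy
in-memory SQLite database in Python. The table names come from this thread,
but the column names, the set name "Watson", the second population and all
of the rows are assumptions made up for illustration, so please check them
against the real schema before relying on any of it.

```python
import sqlite3

# Toy stand-in for the variation database. Column names are assumed,
# not copied from the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variation_set (
    variation_set_id INTEGER PRIMARY KEY, name TEXT, description TEXT);
CREATE TABLE variation_set_variation (
    variation_id INTEGER, variation_set_id INTEGER);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY, name TEXT, size INTEGER);
CREATE TABLE allele (
    allele_id INTEGER PRIMARY KEY, variation_id INTEGER,
    sample_id INTEGER, allele TEXT, frequency REAL);
""")

# Made-up rows: one variation (id 100) that is triallelic overall but has
# only two alleles in the size-one Watson population.
conn.execute(
    "INSERT INTO variation_set VALUES (1, 'Watson', 'hypothetical set')")
conn.execute("INSERT INTO variation_set_variation VALUES (100, 1)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(10, "ENSEMBL:ENSEMBL_Watson", 1),
     (11, "HYPOTHETICAL:other_population", 60)],
)
conn.executemany(
    "INSERT INTO allele (variation_id, sample_id, allele, frequency) "
    "VALUES (?, ?, ?, ?)",
    [(100, 10, "A", 0.5), (100, 10, "G", 0.5),
     (100, 11, "A", 0.6), (100, 11, "G", 0.3), (100, 11, "T", 0.1)],
)


def alleles_in_set_for_population(conn, set_name, population_name):
    """Return (variation_id, allele, frequency) rows for every variation in
    the named set, restricted to alleles linked to the named population."""
    return conn.execute(
        """
        SELECT vsv.variation_id, a.allele, a.frequency
        FROM variation_set vs
        JOIN variation_set_variation vsv
          ON vsv.variation_set_id = vs.variation_set_id
        JOIN allele a ON a.variation_id = vsv.variation_id
        JOIN sample s ON s.sample_id = a.sample_id
        WHERE vs.name = ? AND s.name = ?
        ORDER BY vsv.variation_id, a.allele
        """,
        (set_name, population_name),
    ).fetchall()


watson_alleles = alleles_in_set_for_population(
    conn, "Watson", "ENSEMBL:ENSEMBL_Watson")
print(watson_alleles)
```

The point is that the set only tells you which variations to look at; the
alleles for the individual come from filtering the allele rows by
population, so no direct set-to-allele link is needed.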

Cheers

Will

On 21 December 2010 21:06, Andrea Edwards <edwardsa at cs.man.ac.uk> wrote:

> Hi
>
> I have been reading about the variation database schema here
>
> http://www.ensembl.org/info/docs/api/variation/variation_schema.html
>
> but there is no information in this document about the database tables
> that, judging by their names, look like they deal with variation sets, namely
>
> *variation_set
> *variation_set_structure
> *variation_set_variation
>
> These tables aren't on the PDF schema diagram either.
>
> I was hoping I could get an explanation of these tables.
>
> It looks as though variation_set is simply a variation set with a name and
> description.
>
> It then looks as if variation_set_variation is a simple link table to
> resolve the many-to-many relationship between a variation and a variation
> set. But if that is the case, I don't know how you model the alleles in a
> variation set such as the Watson set.
>
> For example, a particular variation might be triallelic overall (e.g. across
> every individual looked at), but a variation in the Watson variation set can
> have at most two alleles, since it comes from a single diploid individual.
> The table that normally describes the alleles of a variation and their
> frequencies is allele. The allele table links to a sample id, so you know
> which alleles occur for a variation in a population and you know the
> frequency of a particular allele in that population. The allele table
> doesn't seem to have any link to a variation set.
>
> It looks like there should be a link somewhere between a variation set and
> a population/sample so that the allele table can still represent the
> alleles/frequencies of a variation set.
>
> Or I could be guessing this all wrong. Either way, I would really benefit
> from some documentation of the schema that models variation sets. And I
> think I need Ensembl's definition of a variation set (the POD simply says
> "This is a class representing a set of variations that are grouped by e.g.
> study, method, quality measure etc.")
>
> Kind regards