[ensembl-dev] 1000 Genomes SNPS

Pontus Larsson Pontus.Larsson at ebi.ac.uk
Wed Mar 2 15:22:26 GMT 2011


Hi Chris,

Andrea is right: we have grouped these variations together into various
variation sets. For example, you can get all variations belonging to
pilots 1, 2 and 3 from the sets '1000 genomes - Low coverage',
'1000 genomes - High coverage - Trios' and
'1000 genomes - High coverage exons', respectively. You'll need to use the
VariationSet and VariationSetAdaptor modules for this. It is not possible
to retrieve variations based on their submission date.

As Andrea points out, if you call the 'get_all_Variations' method on a
VariationSet object, the API will create all variation objects at once and
return them. For large sets like these, this can easily cause you to run
out of memory. Instead, you can use the 'get_Variation_Iterator' method to
get an Iterator object and iterate over the variations without holding
them all in memory.
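
A rough sketch of how this could look is given below. It assumes you
connect to the public server (ensembldb.ensembl.org, user 'anonymous') with
an API version that matches the release you want, and that the set name is
spelled exactly as above in that release; please check both against your
own setup before relying on it.

```perl
use strict;
use warnings;

use Bio::EnsEMBL::Registry;

# Assumption: the public Ensembl MySQL server is used. Point -host/-user
# at your own mirror if you have one.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

# Adaptor for variation sets in the human variation database.
my $vsa = $registry->get_adaptor('human', 'variation', 'variationset');

# Assumption: the set name below matches the release you are connected to.
my $set_name = '1000 genomes - Low coverage';
my $set      = $vsa->fetch_by_name($set_name);
die "Variation set '$set_name' not found\n" unless defined $set;

# Iterate instead of calling get_all_Variations, so the whole set is
# never held in memory at the same time.
my $iterator = $set->get_Variation_Iterator();
while ( $iterator->has_next() ) {
    my $variation = $iterator->next();
    print $variation->name(), "\n";
}
```

This prints one variation name per line, so the output can be redirected to
a file and post-processed. The same loop should work for the other two
pilot sets by changing $set_name.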

/Pontus



2011/3/2 <cj5 at sanger.ac.uk>

> Hi,
> Is it possible, using the variation API, to get a list of SNPs which have
> been submitted by the 1000 Genomes project?
>
> I have a vague idea that it should be possible to retrieve such a list
> using the SS (submission) ID and/or the validation status, however I am
> unsure of the details and what version of the API should be used.
>
> The latest 1000 Genomes pilot release (2010_07) would be great, but any
> earlier release would also be useful.
>
> Thanks
> Chris