[ensembl-dev] Fwd: Re: 1000 Genomes SNPS

Paul Flicek flicek at ebi.ac.uk
Wed Mar 2 16:24:36 GMT 2011


Using the most up-to-date version of the API is normally the best approach, if that is possible in your analysis.

This is especially true for new things (like the 1000 Genomes data).


Paul



On 2 Mar 2011, at 16:21, Andrea Edwards wrote:

> 
> 
> I used v61
> 
> C:\Documents and Settings\Administrator>mysql -h ensembldb.en
> ous -P 5306
> Welcome to the MySQL monitor.  Commands end with ; or \g.
> Your MySQL connection id is 41648834
> Server version: 5.1.34-log Source distribution
> 
> Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
> 
> mysql>  use homo_sapiens_variation_61_37f;
> Database changed
> mysql>  select distinct name from variation_set;
> +--------------------------------------------+
> 
> On 02/03/2011 16:17, cj5 at sanger.ac.uk wrote:
>> Hi
>> 
>> Many thanks Andrea and Pontus for your help with this. May I ask which
>> version of the API you are both using to get your variation sets?
>> 
>> I am using version 60, and I got a list of available sets by iterating
>> get_all_sub_VariationSets within fetch_all_top_VariationSets. I found
>> only the following 1000 genomes sets:
>> 1000 genomes
>>          1000 genomes - Low coverage
>>          1000 genomes - Trios - YRI
>>          1000 genomes - Trios - CEU
>> 
>> Thanks
>> Chris
>> 
>>> Hi Chris,
>>> 
>>> Andrea is right, we have grouped these variations together into various
>>> variation sets. For example, you can get all variations belonging to the
>>> different pilots from the sets '1000 genomes - Low coverage',
>>> '1000 genomes - High coverage - Trios' and
>>> '1000 genomes - High coverage exons' for pilot 1, 2 and 3,
>>> respectively. You'll need to use the VariationSet and
>>> VariationSetAdaptor modules for this. It is not possible to retrieve the
>>> variations conditional on submission date.
>>> 
>>> As Andrea points out, if you call the 'get_all_Variations' method on a
>>> VariationSet object, the API will create all variation objects and return
>>> them. For large sets like these, this can easily cause you to run out of
>>> memory but you can use the 'get_Variation_Iterator' method to get an
>>> Iterator object and iterate over the variations instead.
>>> 
>>> /Pontus
>>> 
>>> 
>>> 
>>> 2011/3/2<cj5 at sanger.ac.uk>
>>> 
>>>> Hi,
>>>> Is it possible using the variations API to get a list of SNPs which have
>>>> been submitted from the 1000 Genomes project?
>>>> 
>>>> I have a vague idea that it should be possible to retrieve such a list
>>>> using the SS (submission) ID and/or the validation status, however I am
>>>> unsure of the details and what version of the API should be used.
>>>> 
>>>> The latest 1000 Genomes pilot release (2010_07) would be great, but any
>>>> earlier release would also be useful.
>>>> 
>>>> Thanks
>>>> Chris
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Dev mailing list
>>>> Dev at ensembl.org
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>> 
>> 
>> 
> 
> 
> 
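
To illustrate the two ideas discussed in the thread (walking top-level sets and their sub-sets, and iterating over variations one at a time rather than building them all at once), here is a minimal sketch. It is not the Ensembl API, which is Perl: the `VariationSet` class, `iter_variations` and `walk_set_names` are hypothetical in-memory stand-ins, and the variation names are made-up placeholders. Only the set names come from the thread. Having the iterator also descend into sub-sets is a choice made for this sketch, not a statement about how the real API behaves.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class VariationSet:
    """Hypothetical in-memory stand-in for a variation set (not the Ensembl class)."""
    name: str
    variation_names: List[str] = field(default_factory=list)
    sub_sets: List["VariationSet"] = field(default_factory=list)

    def iter_variations(self) -> Iterator[str]:
        """Yield this set's variations, then those of its sub-sets, one at a time."""
        yield from self.variation_names
        for sub in self.sub_sets:
            yield from sub.iter_variations()


def walk_set_names(sets: List[VariationSet], depth: int = 0) -> Iterator[str]:
    """Yield every set name, indented by nesting depth, parents before children."""
    for vs in sets:
        yield "    " * depth + vs.name
        yield from walk_set_names(vs.sub_sets, depth + 1)


# Set names are taken from the thread; variation names are placeholders.
top_level = [
    VariationSet(
        "1000 genomes",
        sub_sets=[
            VariationSet("1000 genomes - Low coverage", ["var_a", "var_b"]),
            VariationSet("1000 genomes - Trios - YRI", ["var_c"]),
            VariationSet("1000 genomes - Trios - CEU", ["var_d"]),
        ],
    )
]

# Print the set hierarchy, one name per line.
for line in walk_set_names(top_level):
    print(line)

# Count variations lazily: the generator hands over one name at a time,
# so no full list of variations is ever built.
count = sum(1 for _ in top_level[0].iter_variations())
print(count)
```

With the real API the same pattern would apply to the iterator object returned by 'get_Variation_Iterator', as Pontus describes, instead of the list returned by 'get_all_Variations'.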




