[ensembl-dev] getting watson snps

Fiona Cunningham fiona at ebi.ac.uk
Tue Dec 14 11:35:20 GMT 2010


Dear Andrea,

It shouldn't take more than three days just to install the allele table,
so you must be doing something wrong there. Out of interest, please
tell me which data you are after that is not included in our dumps? The
format for the Watson variants will be the same as for the genome variants
here:
ftp://ftp.ensembl.org/pub/current/variation/homo_sapiens/

Next release we will include consequences as well.

Best regards,

Fiona

------------------------------------------------------
Fiona Cunningham
Ensembl Variation Project Leader, EBI
www.ensembl.org
www.lrg-sequence.org
t: 01223 494612 || e: fiona at ebi.ac.uk



On 12 December 2010 14:02, Andrea Edwards <edwardsa at cs.man.ac.uk> wrote:
> Hi
>
> The dump files would be great, but I am also retrieving lots of other
> information along with the SNPs, and that might not necessarily be
> in your dump file, so I think I have to try other options too.
>
> This is what I have tried so far to get the Watson SNPs, without getting
> anywhere fast :)
>
> 1. Written a Perl script to download them from the Ensembl human variation
> database. This works but will take over a month to get all the SNPs at the
> rate at which it seems to be running, and I imagine you'll block my IP
> address if I leave it running :) Plus I can't leave it a month anyway.
>
> 2. I've tried to install the human variation database locally, but that also
> seems to be having problems. It has been installing the allele table for
> 3 days now, I think. It is running on a very slow machine, but there are far
> bigger tables than the allele table, so I dread to think how long they will
> take. I tried to get access to a better machine but I wasn't given enough
> hard disk space; perhaps a better machine will solve the problem! How long
> should it take to install the human variation database (roughly) on a 64-bit
> Linux machine with 2 GB of RAM and an Intel Xeon @ 2.27GHz? Will it take
> hours or days?
>
> Is there anything else I can try? I do appreciate that the dataset is vast
> and these things will be slow. Perhaps the answer is simply a faster machine
> to install the local database, and I am looking into this.
>
> I have already looked at getting the SNPs from dbSNP or directly from source,
> but I need to get information associated with the SNPs, so I think I will have
> the same problems retrieving the associated data even if I got the 'raw
> SNPs' by other means.
>
> Many thanks
>
> On 09/12/2010 16:53, Fiona Cunningham wrote:
>>
>> Dear Andrea,
>>
>> We will look into producing the dump file of all SNPs in Watson for
>> the next release, which should make your life easier. BioMart is really
>> best suited to specific queries, and so we should provide dump files
>> where large amounts of information across the entire genome are
>> required.
>>
>> Fiona
>>
>> ------------------------------------------------------
>> Fiona Cunningham
>> Ensembl Variation Project Leader, EBI
>> www.ensembl.org
>> www.lrg-sequence.org
>> t: 01223 494612 || e: fiona at ebi.ac.uk
>>
>>
>>
>> On 9 December 2010 13:46, Andrea Edwards <edwardsa at cs.man.ac.uk> wrote:
>>>
>>> Dear all
>>>
>>> I've tried downloading the Watson SNPs from BioMart by a) the whole set and
>>> b) chromosome by chromosome, and I can't get the data. I have tried
>>> requesting the data by email (no email received) and by direct download
>>> (the download starts but runs at about 1 KB per second and times out after
>>> about 12 hours/10 MB downloaded).
>>>
>>> I have written a script to get the Watson SNPs via the Perl API, but it is
>>> still running and taking hours, so I am scared I will get my IP blocked!
>>> There are 3 million SNPs and it took an hour to get 3000, I think.
>>>
>>> I was thinking of getting the human databases directly, but I am awaiting
>>> a new machine and am totally out of disk space. Does anyone know how big
>>> the human core and variation databases are when installed?
>>>
>>> thanks a lot
>>>
>>> _______________________________________________
>>> Dev mailing list
>>> Dev at ensembl.org
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>
>
>
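
The figures quoted in the thread (roughly 3 million SNPs, about 3000 fetched
per hour through the API) can be turned into a rough time estimate. The sketch
below is plain arithmetic on those two numbers and assumes the rate stays
constant; the function name is illustrative and is not part of any Ensembl API.

```python
def estimated_days(total_items: int, items_per_hour: float) -> float:
    """Return the days needed to fetch total_items at a constant hourly rate."""
    if items_per_hour <= 0:
        raise ValueError("items_per_hour must be positive")
    hours = total_items / items_per_hour
    return hours / 24


# Figures quoted in the thread: ~3 million SNPs, ~3000 fetched in an hour.
days = estimated_days(3_000_000, 3000)
print(f"{days:.1f} days")  # 41.7 days
```

That is in line with the "over a month" estimate given for the script, which
is why a bulk dump file is the more practical route for genome-wide data.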



