[ensembl-dev] querying a copy of the ensembl human variation 72_37

Jayaraman, Pushkala pjayaraman at mcw.edu
Thu Sep 5 21:04:48 BST 2013


This works!! 
Aaah.. I see your modified query.. 

AND fv.variation_id IS NULL;

I guess that's where I was going wrong.. 
Thanks again!!


Pushkala

-----Original Message-----
From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On Behalf Of Andy Yates
Sent: Thursday, September 05, 2013 3:01 PM
To: Ensembl developers list
Subject: Re: [ensembl-dev] querying a copy of the ensembl human variation 72_37

Hi,

If you have loaded the human variation database using the files in ftp://ftp.ensembl.org/pub/release-72/mysql/homo_sapiens_variation_72_37/ and used the homo_sapiens_variation_72_37.sql.gz file to load your schema there should be no problems. All table structures and indexes required for Ensembl are specified in these files. 

However I think you just need to make some changes to your query to make it faster

select
vf.variation_name, vf.seq_region_id, vf.seq_region_start, vf.source_id, vf.minor_allele_freq, tv.feature_stable_id, tv.allele_string, tv.consequence_types, s.name FROM variation_feature vf JOIN transcript_variation tv USING (variation_feature_id) JOIN source s USING (source_id) LEFT JOIN failed_variation fv ON (vf.variation_id = fv.variation_id) WHERE tv.feature_stable_id IN ('ENST00000380152','ENST00000530893','ENST00000528762','ENST00000470094','ENST00000533776','ENST00000544455')
AND fv.variation_id IS NULL;

This comes back with 13,584 rows in ~13 seconds on ensembldb.ensembl.org *. Can you try the re-written query on your DB please.

Andy

------------
Andrew Yates - Ensembl Core Software Project Leader European Bioinformatics Institute (EMBL-EBI) European Molecular Biology Laboratory Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD
Tel: +44-(0)1223-492538
Fax: +44-(0)1223-494468
http://www.ensembl.org/

On 5 Sep 2013, at 20:38, "Jayaraman, Pushkala" <pjayaraman at mcw.edu> wrote:

> Heres my query:
>  
> select
> vf.variation_name, vf.seq_region_id, vf.seq_region_start, 
> vf.source_id, s.name, vf.minor_allele_freq, tv.feature_stable_id, 
> tv.allele_string, tv.consequence_types from variation_feature vf, 
> transcript_variation tv, source s WHERE s.source_id=vf.source_id and 
> vf.variation_feature_id=tv.variation_feature_id and 
> --tv.feature_stable_id ='ENST00000544455'
> tv.feature_stable_id in
> ('ENST00000380152','ENST00000530893','ENST00000528762','ENST0000047009
> 4','ENST00000533776','ENST00000544455')
> and
> vf.variation_id not in
> (
> select
> vf.variation_id
> from
> variation_feature vf,
> transcript_variation tv,
> failed_variation fv
> where
> fv.variation_id=vf.variation_id
> and
> vf.variation_feature_id=tv.variation_feature_id
> and
> --tv.feature_stable_id ='ENST00000544455'
> tv.feature_stable_id in
> ('ENST00000380152','ENST00000530893','ENST00000528762','ENST0000047009
> 4','ENST00000533776','ENST00000544455')
> )
>  
>  
>  
> When I try this on my local copy.. it runs for 5 mins with no result!
>  
> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On 
> Behalf Of Jayaraman, Pushkala
> Sent: Thursday, September 05, 2013 2:25 PM
> To: dev at ensembl.org
> Subject: [ensembl-dev] querying a copy of the ensembl human variation 
> 72_37
>  
> Hello,
> I seem to be having trouble querying the failed_variation table I 
> cannot even seem to get count of the rows..
> Is there any additional indexes you guys apply to the failed_variation table?
>  
> Pushkala Jayaraman
> Programmer/Analyst - Rat Genome Database Human and Molecular Genetics 
> Center Medical College of Wisconsin
> 414-955-2229
> http://rgd.mcw.edu
>  
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: 
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/


_______________________________________________
Dev mailing list    Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/




More information about the Dev mailing list