[ensembl-dev] eQTL API fetch results don't match those on GTEx website

Thomas Juettemann juettemann at ebi.ac.uk
Fri Nov 23 10:05:38 GMT 2018


Hello Shraddha,

The explanation is that we store the nominal p-values, whereas the portal distributes the empirical p-values.

More info from https://gtexportal.org/home/documentationPage#staticTextAnalysisMethods

	• Nominal p-values were generated for each variant-gene pair by testing the alternative hypothesis that the slope of a linear regression model between genotype and expression deviates from 0.
	• The mapping window was defined as 1 megabase up- and downstream of the transcription start site.
	• For each tissue, variants in the VCF were selected based on the following thresholds: the minor allele was observed in at least 10 samples, and the minor allele frequency was ≥ 1%.
	• The adaptive permutations mode was used with the setting “--permute 1000 10000”.
	• Beta distribution-adjusted empirical p-values from FastQTL were used to calculate q-values (Storey & Tibshirani, PNAS, 2003), and a false discovery rate (FDR) threshold of ≤ 0.05 was applied to identify genes with a significant eQTL (“eGenes”).
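
As a rough illustration of the variant-selection thresholds quoted above (minor allele seen in at least 10 samples, minor allele frequency ≥ 1%), here is a minimal Python sketch. The function name, the 0/1/2 dosage encoding and the handling of missing calls are assumptions made for the example; this is not the GTEx pipeline code.

```python
def passes_variant_filter(dosages, min_carriers=10, min_maf=0.01):
    """Check one variant against the quoted thresholds.

    dosages: per-sample alternate-allele counts (0, 1 or 2);
             None marks a missing call (an assumption of this sketch).
    """
    called = [d for d in dosages if d is not None]
    if not called:
        return False

    # Alternate-allele frequency over all called chromosomes.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)

    # Count samples carrying at least one copy of the minor allele,
    # whichever allele (alt or ref) turns out to be the minor one.
    if alt_freq <= 0.5:
        carriers = sum(1 for d in called if d > 0)
    else:
        carriers = sum(1 for d in called if d < 2)

    return carriers >= min_carriers and maf >= min_maf


if __name__ == "__main__":
    # 100 samples, 10 heterozygotes: 10 carriers, MAF = 10 / 200 = 0.05
    print(passes_variant_filter([1] * 10 + [0] * 90))
```

Both conditions must hold at once: in a large cohort a variant can have 10 carriers and still fall below 1% frequency, and in a small one it can exceed 1% with fewer than 10 carriers.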

I hope this explains the different results. Please don’t hesitate to ask further questions if anything remains unclear.

Best wishes,
Thomas



On 22 Nov 2018, at 15:54, Shraddha Pai <shraddhapai.neuro at gmail.com> wrote:

Hello Ensembl dev community,

I used the Ensembl REST API for the first time yesterday, to fetch eQTL results associated with a set of SNPs. But the results I get from the Ensembl API don't match those from a manual query I performed on the GTEx website.

I used this endpoint: GET eqtl/variant_name/:species/:variant_name 
http://rest.ensembl.org/documentation/info/species_variant
Isn't GTEx the source for these eQTL associations?  

Here is the snippet of Perl code I used to fetch results for a snp:
---
    print "$snp\n";

    my $ping_endpoint = "/eqtl/variant_name/homo_sapiens/$snp";
    my $url = $server.$ping_endpoint;
    my $response = rest_request($url, $headers);
    for my $cur ( @{ $response }) {
        my $tis=$cur->{tissue};
        my $val=$cur->{value};
        my $gene=$cur->{gene};
        my $statistic=$cur->{statistic};
        if  ($statistic=~/p-value/) {
            print OUT "$snp\t$tis\t$gene\t$statistic\t$val\n";
        }
    }
close(OUT) || die "$!";
---
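
For readers who prefer Python, the same request and filtering could be sketched roughly as follows. The endpoint path and the tissue, gene, statistic and value fields are taken from the snippet above; the server URL comes from the documentation link in this message, and the request header and the function names are assumptions of this sketch. Parsing is kept separate from the network call so it can be checked without a live request.

```python
import json
import urllib.request

# Server taken from the documentation link in this message.
SERVER = "http://rest.ensembl.org"


def pvalue_rows(snp, records):
    """Keep only the p-value records, mirroring the Perl regex filter."""
    rows = []
    for rec in records:
        if "p-value" in rec.get("statistic", ""):
            rows.append((snp, rec["tissue"], rec["gene"],
                         rec["statistic"], float(rec["value"])))
    return rows


def fetch_eqtl(snp, server=SERVER):
    """Fetch eQTL records for one variant name (requires network access)."""
    url = f"{server}/eqtl/variant_name/homo_sapiens/{snp}"
    # Assumption: JSON is requested via this header.
    req = urllib.request.Request(
        url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    snp = "rs11703062"
    for row in pvalue_rows(snp, fetch_eqtl(snp)):
        print("\t".join(str(field) for field in row))
```

Note that, per the reply at the top of this thread, the values returned here are nominal p-values, so thresholding them will not reproduce the portal's "significant eGenes" table.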

As an example, according to my API call, the only significant eQTL for rs11703062 is this:
rs11703062  Adipose_Visceral_Omentum    ENSG00000100376 p-value 0.0289612051829416

But the GTEx page lists many more (see "significant eGenes" table in screenshot).
<Screen Shot 2018-11-22 at 10.49.34 AM.png>

What am I missing here?

Thanks!
Shraddha
----
Shraddha Pai
Post-doctoral Fellow, http://baderlab.org
The Donnelly Center, University of Toronto
_______________________________________________
Dev mailing list    Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/

—
Thomas Juettemann, PhD
Ensembl Regulation

European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom
Tel: +44 (0)1223 494696




