[ensembl-dev] eQTL API fetch results don't match those on GTEx website
Thomas Juettemann
juettemann at ebi.ac.uk
Fri Nov 23 10:05:38 GMT 2018
Hello Shraddha,
The explanation is that we store the nominal p-values, whereas the GTEx portal distributes the empirical p-values.
More info from https://gtexportal.org/home/documentationPage#staticTextAnalysisMethods
• Nominal p-values were generated for each variant-gene pair by testing the alternative hypothesis that the slope of a linear regression model between genotype and expression deviates from 0.
• The mapping window was defined as 1 megabase up- and downstream of the transcription start site.
• For each tissue, variants in the VCF were selected based on the following thresholds: the minor allele was observed in at least 10 samples, and the minor allele frequency was ≥ 1%.
• The adaptive permutations mode was used with the setting “--permute 1000 10000”.
• Beta distribution-adjusted empirical p-values from FastQTL were used to calculate q-values (Storey & Tibshirani, PNAS, 2003), and a false discovery rate (FDR) threshold of ≤0.05 was applied to identify genes with a significant eQTL (“eGenes”).
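To make the distinction concrete, here is a rough sketch in Python (standard library only) of how a nominal p-value for a single variant-gene pair differs from a permutation-based, gene-level empirical p-value. It is only an illustration under stated assumptions, not the GTEx or FastQTL implementation: the function names are made up for this example, the nominal p-value uses a normal approximation to the t statistic, a fixed number of permutations is used instead of FastQTL's adaptive mode, and the beta-distribution adjustment and q-value steps are left out.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return sxy / math.sqrt(sxx * syy)


def nominal_p(r, n):
    """Two-sided p-value for 'slope differs from 0' in a simple regression.

    Assumption: normal approximation to the t statistic (reasonable for
    large n); an exact version would use the t distribution with n - 2 df.
    """
    if abs(r) >= 1.0:
        return 0.0
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


def gene_level_p(genotypes, expression, n_perm=1000, seed=0):
    """Return (smallest nominal p, empirical p) for one gene.

    genotypes: one list of allele dosages per variant in the cis window.
    expression: expression values for the same samples, in the same order.
    The empirical p-value asks how often the best variant under permuted
    expression is at least as strong as the best observed variant.
    """
    rng = random.Random(seed)
    n = len(expression)
    best_obs = max(abs(pearson_r(g, expression)) for g in genotypes)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        best_perm = max(abs(pearson_r(g, shuffled)) for g in genotypes)
        if best_perm >= best_obs:
            hits += 1
    return nominal_p(best_obs, n), (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy null data: random genotypes, expression unrelated to any variant.
    rng = random.Random(1)
    n_samples, n_variants = 60, 10
    genotypes = [[rng.choice([0, 1, 2]) for _ in range(n_samples)]
                 for _ in range(n_variants)]
    expression = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    nom, emp = gene_level_p(genotypes, expression, n_perm=200)
    print(f"smallest nominal p = {nom:.4f}, empirical gene-level p = {emp:.4f}")
```

Because the empirical p-value accounts for all variants tested in the window, it is typically larger than the smallest nominal p-value when there is no real signal, so a list filtered on one kind of p-value need not match a list built from the other.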
I hope this explains the different results. Please don’t hesitate to ask further questions if anything remains unclear.
Best wishes,
Thomas
On 22 Nov 2018, at 15:54, Shraddha Pai <shraddhapai.neuro at gmail.com> wrote:
Hello Ensembl dev community,
I used the Ensembl REST API for the first time yesterday, to fetch eQTL results associated with a set of SNPs. But the results I get from the Ensembl API don't match those from a manual query I performed on the GTEx website.
I used this endpoint: GET eqtl/variant_name/:species/:variant_name
http://rest.ensembl.org/documentation/info/species_variant
Isn't GTEx the source for these eQTL associations?
Here is the snippet of Perl code I used to fetch results for a SNP:
---
print "$snp\n";
my $ping_endpoint = "/eqtl/variant_name/homo_sapiens/$snp";
my $url = $server.$ping_endpoint;
my $response = rest_request($url, $headers);
for my $cur ( @{ $response } ) {
    my $tis       = $cur->{tissue};
    my $val       = $cur->{value};
    my $gene      = $cur->{gene};
    my $statistic = $cur->{statistic};
    if ( $statistic =~ /p-value/ ) {
        print OUT "$snp\t$tis\t$gene\t$statistic\t$val\n";
    }
}
close(OUT) || die "$!";
---
As an example, according to my API call, the only significant eQTL for rs11703062 is this:
rs11703062 Adipose_Visceral_Omentum ENSG00000100376 p-value 0.0289612051829416
But the GTEx page lists many more (see "significant eGenes" table in screenshot).
<Screen Shot 2018-11-22 at 10.49.34 AM.png>
What am I missing here?
Thanks!
Shraddha
----
Shraddha Pai
Post-doctoral Fellow, http://baderlab.org
The Donnelly Center, University of Toronto
_______________________________________________
Dev mailing list Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/
—
Thomas Juettemann, PhD
Ensembl Regulation
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom
Tel: +44 (0)1223 494696