[ensembl-dev] Where can I get the same data as the REST API for variation with population genotypes?

Kieron Taylor ktaylor at ebi.ac.uk
Wed Aug 3 10:28:57 BST 2016


Dear Juliano,

You are correct that requesting millions of variants at 15 per second would take a very long time: at that rate, each million variants takes roughly 18.5 hours. This is why we support batch requests via POST.

https://rest.ensembl.org/documentation/info/variation_post

You can greatly increase your throughput by putting hundreds of IDs in each request, for example:

curl 'https://rest.ensembl.org/variation/homo_sapiens' -H 'Content-type:application/json' -H 'Accept:application/json' -X POST -d '{ "ids" : ["rs56116432", "COSM476", .... ] }'
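
The same batching idea can be sketched in Python using only the standard library. This is a rough illustration, not an official client: the helper names (chunked, build_request, fetch_variants) are made up for this example, the batch size of 200 is an assumed value that should be checked against the endpoint documentation linked above, and the population_genotypes query parameter is assumed to be accepted on the POST endpoint in the same way as on the GET endpoint.

```python
import json
import urllib.request
from typing import Dict, Iterator, List

SERVER = "https://rest.ensembl.org"
ENDPOINT = "/variation/homo_sapiens"


def chunked(ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_request(batch: List[str],
                  population_genotypes: bool = False) -> urllib.request.Request:
    """Build (but do not send) one POST request for a batch of variant IDs."""
    url = SERVER + ENDPOINT
    if population_genotypes:
        # Assumption: the POST endpoint accepts this optional parameter.
        url += "?population_genotypes=1"
    body = json.dumps({"ids": batch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )


def fetch_variants(ids: List[str], batch_size: int = 200,
                   population_genotypes: bool = False) -> Dict[str, dict]:
    """Send one POST per batch and merge the responses.

    Assumption: each response is a JSON object keyed by variant ID.
    batch_size=200 is a guess; check the documented maximum POST size.
    """
    results: Dict[str, dict] = {}
    for batch in chunked(ids, batch_size):
        request = build_request(batch, population_genotypes)
        with urllib.request.urlopen(request) as response:
            results.update(json.load(response))
    return results
```

Calling fetch_variants(list_of_ids, population_genotypes=True) would then issue one request per batch rather than one per variant. A real run over millions of IDs would also want retries and some respect for the server's rate-limit headers, which are left out here for brevity.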

If that is not suitable (the run time would perhaps still be measured in a few hours), we have the original Variation API, which can undoubtedly help you get the data directly. Someone from our helpdesk or Variation team might be able to help if you still feel you must query the database directly.

Regards,

Kieron


Kieron Taylor PhD.
Ensembl Developer

EMBL, European Bioinformatics Institute






> On 2 Aug 2016, at 20:51, Juliano Martins <julianovmartins at gmail.com> wrote:
> 
> Hello,
> 
> I am a Brazilian computer science student at the Catholic University of Paraná. I am starting research on genome variants and would like to extract some statistics from such data; for that, I need to download it to my local machine.
> 
> Using the REST API (https://rest.ensembl.org/variation/human/rs56116432?content-type=application/json;population_genotypes=1) I get exactly the data I need, but it would be very time-consuming for millions of variant IDs because of the service's rate limits.
> 
> Where could I get the same information (range, population, genotype) in bulk?
> 
> I tried to find this data in the public Ensembl MySQL database and ran several queries, but did not get the same data. In particular, the 'population' table and the frequency fields do not seem to contain the same data I got from the REST API.
> 
> I need variation data for the assemblies GRCh37 and GRCh38, and looked at these two databases:
> - homo_sapiens_variation_73_37
> - homo_sapiens_variation_85_38
> 
> 
> This is a sample query that I used in the database (homo_sapiens_variation_73_37):
> 
> SELECT DISTINCT
>   variation.name,
>   allele.frequency,
>   allele.count,
>   allele_code.allele,
>   variation.ancestral_allele,
>   variation.minor_allele,
>   variation.minor_allele_freq,
>   variation.minor_allele_count,
>   population.name
> FROM allele
> JOIN allele_code ON allele.allele_code_id = allele_code.allele_code_id
> JOIN variation ON allele.variation_id = variation.variation_id
> JOIN population ON allele.population_id = population.population_id
> WHERE variation.name = 'rs56116432';
> 
> Thank you!
> Sorry for my bad English.
> 
> Juliano V. Martins
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/




