[ensembl-dev] Accessing sequence information without using the Perl API

Cook, Malcolm MEC at stowers.org
Wed Jan 3 22:08:30 GMT 2018


Hello Felix et al,

I used to mirror Ensembl locally and found that the best way to "retrieve lists of available species and genome builds" is to take advantage of the fact that Ensembl employs a consistent naming convention for its MySQL databases, which lets you determine the species and builds by querying the INFORMATION_SCHEMA SCHEMATA table<https://dev.mysql.com/doc/refman/5.7/en/schemata-table.html>.
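
A minimal Java sketch of that approach, offered as an illustration rather than a tested recipe. It assumes core databases are named <species>_core_<release>_<assembly> (for example homo_sapiens_core_91_38), that a MySQL JDBC driver is on the classpath, and that the server accepts a password-less read-only user. The class and method names (EnsemblSchemata, CoreDb, parse, listCoreDbs) are made up for this example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EnsemblSchemata {

    // Assumed naming convention for core databases:
    //   <species>_core_<release>_<assembly>, e.g. homo_sapiens_core_91_38
    private static final Pattern CORE_DB =
            Pattern.compile("^([a-z0-9_]+)_core_(\\d+)_([0-9a-z]+)$");

    // "_" is a single-character wildcard in LIKE, so this pattern is only a
    // coarse server-side filter; the regex above does the strict matching.
    static final String LIST_SQL =
            "SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA "
          + "WHERE SCHEMA_NAME LIKE ? ORDER BY SCHEMA_NAME";

    /** Species, release and assembly version parsed from one database name. */
    static final class CoreDb {
        final String species;
        final int release;
        final String assembly;

        CoreDb(String species, int release, String assembly) {
            this.species = species;
            this.release = release;
            this.assembly = assembly;
        }
    }

    /** Parses a schema name; returns empty if it does not fit the assumed convention. */
    static Optional<CoreDb> parse(String schemaName) {
        Matcher m = CORE_DB.matcher(schemaName);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(
                new CoreDb(m.group(1), Integer.parseInt(m.group(2)), m.group(3)));
    }

    /**
     * Lists the core databases of one release on the given server.
     * jdbcUrl is something like "jdbc:mysql://your-mirror-host:3306/" (hypothetical host).
     */
    static List<CoreDb> listCoreDbs(String jdbcUrl, String user, int release)
            throws SQLException {
        List<CoreDb> result = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl, user, "");
             PreparedStatement ps = conn.prepareStatement(LIST_SQL)) {
            ps.setString(1, "%_core_" + release + "_%");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    parse(rs.getString(1))
                            .filter(db -> db.release == release)
                            .ifPresent(result::add);
                }
            }
        }
        return result;
    }
}
```

Names that do not fit the assumed pattern, such as ensembl_compara_91, are simply skipped, so the same query can be pointed at a local mirror or a remote server without special-casing the non-species databases.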

For other purposes, the main disadvantage of direct SQL is that the schema might change, and your queries are not protected from such a change. Nominally, a published API should shield your source code from it. In practice, such changes rarely happen, and by limiting yourself to the Perl API you are probably trading one problem for another. I found the schema to be well documented, quite stable, worth learning, and the most efficient way to access Ensembl, whether on my local mirror or remotely (at some point Ensembl introduced US mirrors, making such querying even faster). I don't know all the pertinent characteristics of your project, but I would probably go the direct-access route in most cases.

Good luck,

~Malcolm

From: Dev [mailto:dev-bounces at ensembl.org] On Behalf Of Felix Krueger
Sent: Wednesday, January 3, 2018 2:25 PM
To: dev at ensembl.org
Subject: [ensembl-dev] Accessing sequence information without using the Perl API

Dear all,

I am looking to use the Ensembl API to retrieve lists of available species and genome builds, as well as potentially large amounts of sequence information for sets of coordinates in various genomes. For this Java-based project we cannot use the Perl API, but could potentially use the REST API or direct MySQL access. Would anyone be able to give me a pointer to what is likely the best way forward?

Many thanks and a Happy New Year!
Kind regards,
Felix

The Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT Registered Charity No. 1053902.