[ensembl-dev] REST API - unable to specify release version in requests

Kurt Wheeler kurt.wheeler91 at gmail.com
Tue May 22 14:47:54 BST 2018


Hmm, I think that does help quite a bit.

Thanks for the quick reply.

- Kurt

On Tue, May 22, 2018 at 5:06 AM, Premanand Achuthan <prem at ebi.ac.uk> wrote:
> Hi Kurt
>
> rest.ensembl.org always points to the latest release of Ensembl. You can use
> the /info/software endpoint to get the current Ensembl release number.
>
> http://rest.ensembl.org/info/software?content-type=application/json
>
> {
> release: 92
> }
>
> Ensembl archives the previous 5 releases of the REST service, which can be
> accessed via the following URLs:
>
> http://e{ENSEMBL_RELEASE_VERSION}.rest.ensembl.org/
>
> http://e87.rest.ensembl.org/
> http://e88.rest.ensembl.org/
> http://e89.rest.ensembl.org/
> http://e90.rest.ensembl.org/
> http://e91.rest.ensembl.org/
>
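As a rough illustration of the URL scheme above, here is a minimal Python sketch that picks the REST host for a given release. The helper names (rest_base_url, archived_releases, species_info_url) are hypothetical, and it assumes the archive hosts serve the same /info/species path as the main server, which the thread implies but does not state.

```python
# Sketch only: helper names are hypothetical, not part of any Ensembl client.

LATEST_HOST = "http://rest.ensembl.org"


def rest_base_url(release=None):
    """Return the REST base URL for a release; None means the latest release."""
    if release is None:
        return LATEST_HOST
    release = int(release)
    if release < 1:
        raise ValueError("release must be a positive integer")
    # Archive hosts follow the e<release>.rest.ensembl.org pattern quoted above.
    return f"http://e{release}.rest.ensembl.org"


def archived_releases(current, keep=5):
    """List the releases expected to be archived, given the current release.

    Assumes exactly the previous `keep` releases are kept, as described above.
    """
    return list(range(current - keep, current))


def species_info_url(release=None):
    """Build an /info/species URL (assumed to exist on archive hosts too)."""
    return f"{rest_base_url(release)}/info/species?content-type=application/json"


if __name__ == "__main__":
    print(archived_releases(92))
    print(species_info_url(91))
```

With the current release at 92, archived_releases(92) lists 87 through 91, matching the hosts above. Anything older would fall outside the archive and need another source.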
> Hope it helps,
>
> Best Regards
> Prem
>
>
> On 21/05/2018 19:35, Kurt Wheeler wrote:
>>
>> TL;DR: I would like to be able to specify the release version I am
>> querying when using the REST API, like so:
>>
>>
>> https://rest.ensembl.org/info/species?release=91&content-type=application/json
>>
>> where the key part of that request is `?release=91`
>>
>>
>> Here's why:
>>
>> I don't use Ensembl through the Perl API, so I can't run:
>>
>> use Bio::EnsEMBL::ApiVersion;
>> printf( "The API version used is %s\n", software_version() );
>>
>> to provide the API version I am using. However my issue is actually
>> directly linked to not being able to specify what version of the REST
>> API I am using. My project uses Ensembl's REST API to build an FTP URL
>> to then download from. Specifically, I use:
>>
>> https://rest.ensembl.org/documentation/info/species
>>
>> and
>>
>>
>> http://rest.ensemblgenomes.org/info/genomes/division/{division}?content-type=application/json
>> (replacing {division} with the division of Ensembl I am trying to
>> access)
>>
>> From that response, I use a few fields to construct the URL. For
>> example consider this species:
>>
>> {
>>      division: "Ensembl",
>>      taxon_id: "7955",
>>      name: "danio_rerio",
>>      release: 92,
>>      display_name: "Zebrafish",
>>      accession: "GCA_000002035.4",
>>      strain_collection: null,
>>      common_name: "zebrafish",
>>      strain: null,
>>      aliases: [
>>          "drer",
>>          "danio rerio",
>>          "d_rerio",
>>          "danio",
>>          "zebrafish",
>>          "7955",
>>          "danrer",
>>          "drerio",
>>          "zfish"
>>      ],
>>      groups: [
>>          "core",
>>          "otherfeatures",
>>          "rnaseq",
>>          "variation",
>>          "funcgen"
>>      ],
>>      assembly: "GRCz11"
>> }
>>
>> and its corresponding URLs for GTF and FASTA files:
>>
>> ftp://ftp.ensembl.org/pub/release-92/gtf/danio_rerio/Danio_rerio.GRCz11.92.gtf.gz
>>
>> ftp://ftp.ensembl.org/pub/release-92/fasta/danio_rerio/dna/Danio_rerio.GRCz11.dna.toplevel.fa.gz
>>
>> We use a few fields to do this, but as an example consider the
>> `assembly` field. This field changed from `GRCz10` to `GRCz11` between
>> release 91 and 92. Therefore the old URLs for the files I am
>> interested in are:
>>
>>
>> ftp://ftp.ensembl.org/pub/release-91/gtf/danio_rerio/Danio_rerio.GRCz10.91.gtf.gz
>>
>> ftp://ftp.ensembl.org/pub/release-91/fasta/danio_rerio/dna/Danio_rerio.GRCz10.dna.toplevel.fa.gz
>>
>> Since my code was trying to use release 91, but was receiving
>> information about the species from the REST API regarding release 92,
>> it generated the URLs:
>>
>>
>> ftp://ftp.ensembl.org/pub/release-91/gtf/danio_rerio/Danio_rerio.GRCz11.91.gtf.gz
>>
>> ftp://ftp.ensembl.org/pub/release-91/fasta/danio_rerio/dna/Danio_rerio.GRCz11.dna.toplevel.fa.gz
>>
>> which do not exist. The REST API will only return information about
>> the latest release, which means that until I update my Ensembl version
>> to 92, my code breaks because the URL I build doesn't exist.
>>
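The mismatch described above can be sketched in Python as follows. The function name ftp_urls is hypothetical, and the file-name pattern is inferred only from the two example URLs in this message, so treat it as an assumption rather than a documented FTP layout. The sketch refuses to build URLs when the species record comes from a different release than the one requested, which is the failure mode in question.

```python
# Sketch only: ftp_urls is a hypothetical helper; the path pattern is
# inferred from the example GTF/FASTA URLs, not from official documentation.

FTP_ROOT = "ftp://ftp.ensembl.org/pub"


def ftp_urls(species, release):
    """Return (gtf_url, fasta_url) for a species record and a release.

    Raises ValueError if the record describes a different release, since
    fields such as `assembly` can change between releases.
    """
    if species["release"] != release:
        raise ValueError(
            f"species record is for release {species['release']}, "
            f"but release {release} was requested"
        )
    name = species["name"]          # e.g. "danio_rerio"
    assembly = species["assembly"]  # e.g. "GRCz11"
    prefix = name.capitalize()      # "danio_rerio" -> "Danio_rerio"
    base = f"{FTP_ROOT}/release-{release}"
    gtf = f"{base}/gtf/{name}/{prefix}.{assembly}.{release}.gtf.gz"
    fasta = f"{base}/fasta/{name}/dna/{prefix}.{assembly}.dna.toplevel.fa.gz"
    return gtf, fasta


if __name__ == "__main__":
    zebrafish = {"name": "danio_rerio", "assembly": "GRCz11", "release": 92}
    for url in ftp_urls(zebrafish, 92):
        print(url)
```

Failing early on a release mismatch turns a silent "file not found" on the FTP server into an explicit error at the point where the metadata and the requested release disagree.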
>> The fact that it breaks my code is annoying, but still manageable.
>> However what concerns me even more is that I think this endangers the
>> reproducibility of data generated by my project. It is very important
>> to my project that users be able to determine exactly what version of
>> everything was used so that if anyone wants to check the validity of
>> our work, we can tell them exactly how to replicate what we did. If
>> the code we're using to build transcriptome indices breaks whenever a
>> new Ensembl version is released, then any data we processed using
>> those transcriptome indices cannot be replicated exactly because
>> there's no way to make our code run with the old release of Ensembl.
>>
>> So to summarize, I am inquiring about the possibility of adding a
>> `release` query parameter to the REST API. For my use case it would be
>> sufficient to have it on the /info/species endpoint, but it seems like
>> it would probably make sense API-wide.
>>
>> At the end of the day, we have workarounds if this query parameter
>> cannot be added, so this isn't the end of the world. However, I think
>> this feature is something that should be supported anyway.
>>
>> Thanks,
>>
>> - Kurt
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>
>


