[ensembl-dev] JSON data on FTP

John Tate jgt at ebi.ac.uk
Fri Sep 27 17:03:28 BST 2024


Hi Sébastien,

These JSON files are a fairly comprehensive dump of the data for a given species. I’d encourage you to have a look at a few examples and see if they have what you need. They are included in both Ensembl and Ensembl Genomes releases from E92 and EG35 onwards, and will be in all future releases.

The checksum file contains the output of “sum”, the BSD checksum calculator, run on the JSON file. Each line gives the checksum itself, the file size (in 1 KB blocks), and the filename.
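
For anyone who wants to verify a download where “sum” isn't available, here is a minimal Python sketch of the BSD checksum algorithm as I understand it (a 16-bit rotate-right-and-add over every byte, with the size reported in 1024-byte blocks, rounded up). The function names are illustrative only and not part of any Ensembl tooling, so please check the result against a real CHECKSUMS entry before relying on it.

```python
def update_bsd_sum(checksum: int, data: bytes) -> int:
    """Fold a chunk of bytes into a running 16-bit BSD checksum."""
    for byte in data:
        # Rotate the 16-bit checksum right by one bit, then add the byte.
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


def bsd_sum_file(path: str, chunk_size: int = 1 << 20) -> tuple[int, int]:
    """Return (checksum, size in 1024-byte blocks, rounded up) for a file."""
    checksum = 0
    size = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            checksum = update_bsd_sum(checksum, chunk)
            size += len(chunk)
    return checksum, (size + 1023) // 1024
```

Calling `bsd_sum_file()` on a downloaded JSON file returns two numbers that can be compared with the first two columns of the matching CHECKSUMS line. “sum” normally prints the checksum zero-padded to five digits, so compare the values as integers rather than as strings.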

Thanks for pointing out that these files are not described in the READMEs or on the help and docs pages.

I’ve updated the “current_README” files to add a line about the JSON files, and fixed the URL typo at the same time. The change should be reflected in the live FTP areas soon. I’ve also submitted a PR to add the description to the website help and docs page, but that won't appear in the live website until sometime after the upcoming 113 release.

Cheers,

John.


> On 10 Sep 2024, at 10:56, Sebastien Moretti <sebastien.moretti at unil.ch> wrote:
> 
> Hi
> 
> I have just found the Ensembl data in the JSON format on your FTP and have several questions and comments.
> 
> - How complete are those JSON files? Are they, kind of, a dump for each species database?
> - Will those JSON files stay for long? Are they released now with each new Ensembl version?
> - Do Ensembl Genomes provide the same JSON?
> - What kind of checksum is provided with each JSON file (the CHECKSUMS file)?
> 
> Data in JSON are not mentioned on https://www.ensembl.org/info/data/ftp/index.html but in the species table, without description.
> They are also not described in https://ftp.ensembl.org/pub/current_README
> 
> Best
> 
> P.S. there is a typo in https://ftp.ensembl.org/pub/current_README
> Should be *see http://ensembl.org/* instead of *see http;//ensembl.org/*
> 
> --
> Sébastien Moretti
> Staff Scientist
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221
> https://bioinfo.unil.ch/ https://www.bgee.org/ https://selectome.org/

-- 
John Tate
e: jgt at ebi.ac.uk
a: European Bioinformatics Institute (EMBL-EBI), CB10 1SD, UK
