[ensembl-dev] JSON data on FTP

Sebastien Moretti sebastien.moretti at unil.ch
Mon Sep 30 08:27:30 BST 2024


Hi John

Thanks for all your explanations, and the updates.

The JSON files appear to contain everything we need.

Cheers
Sébastien

On 27/09/2024 18:03, John Tate wrote:
> Hi Sébastien,
> 
> These JSON files are a fairly comprehensive dump of the data for a given 
> species. I’d encourage you to have a look at a few examples and see if 
> they have what you need. You'll find them in both Ensembl and 
> EnsemblGenomes releases, going back to E92 and EG35 and in all future 
> releases.
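
One way to "have a look at a few examples" is to print the shape of a downloaded file rather than read it raw. The following is only a minimal sketch: the helper name `describe` and the depth limit are illustrative assumptions, nothing here relies on the actual Ensembl JSON schema, and `json.load` reads the whole file into memory, which may be impractical for very large dumps.

```python
import json


def describe(value, max_depth=2):
    """Return a compact description of the shape of a parsed JSON value."""
    if isinstance(value, dict):
        if max_depth == 0:
            return f"object({len(value)} keys)"
        return {key: describe(child, max_depth - 1) for key, child in value.items()}
    if isinstance(value, list):
        if not value or max_depth == 0:
            return f"array({len(value)} items)"
        # Describe only the first element and note how many there are in total.
        return [describe(value[0], max_depth - 1), f"... {len(value)} items"]
    # Scalars: report the Python type name (str, int, float, bool, NoneType).
    return type(value).__name__


def describe_file(path, max_depth=2):
    """Load a JSON file (path supplied by the caller) and describe its shape."""
    with open(path, encoding="utf-8") as handle:
        return describe(json.load(handle), max_depth)
```

Calling `describe_file` on an uncompressed copy of one of the files, with a small `max_depth`, gives a quick overview of the top-level keys before deciding whether the data cover what is needed.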
> 
> The checksum file contains the output of “sum”, the BSD checksum 
> calculator, on the JSON file. You’ll see the checksum itself, the file 
> size (in 1 KB blocks), and the filename.
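
For anyone wanting to verify a download without the "sum" tool, the classic BSD algorithm is simple to reproduce: a 16-bit checksum that is rotated right by one bit before each byte is added. The sketch below assumes the CHECKSUMS lines have the three whitespace-separated fields described above (checksum, block count, filename) and that blocks are 1024 bytes, as in the default mode of GNU "sum". The function names and the example filename are hypothetical.

```python
def bsd_sum(data: bytes, checksum: int = 0) -> int:
    """Update and return the 16-bit BSD checksum over `data`."""
    for byte in data:
        # Rotate the 16-bit value right by one bit, then add the next byte.
        checksum = (checksum >> 1) + ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


def bsd_sum_file(path, chunk_size=1 << 20):
    """Return (checksum, number of 1024-byte blocks) for the file at `path`."""
    checksum = 0
    size = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            checksum = bsd_sum(chunk, checksum)
            size += len(chunk)
    return checksum, (size + 1023) // 1024  # round up to whole blocks


def parse_checksums_line(line: str):
    """Split an assumed 'checksum blocks filename' line into typed fields."""
    checksum, blocks, filename = line.split(None, 2)
    return int(checksum), int(blocks), filename.strip()


def matches(path, line: str) -> bool:
    """Compare a local file against one line of a CHECKSUMS file."""
    expected_sum, expected_blocks, _ = parse_checksums_line(line)
    return bsd_sum_file(path) == (expected_sum, expected_blocks)
```

Note that a 16-bit checksum only guards against accidental corruption in transfer; it is not a cryptographic integrity check.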
> 
> Thanks for pointing out that there is no description for these files in 
> the READMEs and help and docs.
> 
> I’ve updated the “current_README” files to add a line about the JSON 
> files, and fixed the URL typo at the same time. The change should be 
> reflected in the live FTP areas soon. I’ve also submitted a PR to add 
> the description to the website help and docs page, but that won't appear 
> in the live website until sometime after the upcoming 113 release.
> 
> Cheers,
> 
> John.
> 
> 
>> On 10 Sep 2024, at 10:56, Sebastien Moretti 
>> <sebastien.moretti at unil.ch> wrote:
>>
>> Hi
>>
>> I have just found the Ensembl data in JSON format on your FTP site and 
>> have several questions and comments.
>>
>> - How complete are those JSON files? Are they essentially a dump of 
>> each species database?
>> - Will those JSON files stay for long? Are they now released with each 
>> new Ensembl version?
>> - Does Ensembl Genomes provide the same JSON files?
>> - What kind of checksum is provided with each JSON file (the CHECKSUMS 
>> file)?
>>
>> The JSON data are not mentioned on
>> https://www.ensembl.org/info/data/ftp/index.html except in the species
>> table, where they have no description.
>> They are also not described in https://ftp.ensembl.org/pub/current_README
>>
>> Best
>>
>> P.S. there is a typo in https://ftp.ensembl.org/pub/current_README
>> Should be *see http://ensembl.org/* instead of *see http;//ensembl.org/*
>>
>> --
>> Sébastien Moretti
>> Staff Scientist
>> Department of Ecology and Evolution,
>> Biophore, University of Lausanne,
>> CH-1015 Lausanne, Switzerland
>> Tel.: +41 (21) 692 4221
>> https://bioinfo.unil.ch/ https://www.bgee.org/ https://selectome.org/
>>
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
>> Ensembl Blog: http://www.ensembl.info/
> 
> -- 
> John Tate
> e: jgt at ebi.ac.uk
> a: European Bioinformatics Institute (EMBL-EBI), CB10 1SD, UK


