[ensembl-dev] Missing files on the FTP
reham
reham at ebi.ac.uk
Fri Sep 12 16:09:03 BST 2025
Dear Matthieu,
I hope you are well.
Thanks for writing in about this. We are aware of most of these issues,
and are currently working on a new FTP structure.
Although some of the paths may have been generated in error, the vast
majority of the missing files will be regenerated. As for the opposite
case (files present on the FTP but not listed in species.json), it
shouldn't happen except on the day new files are released.
We are currently working on checks to ensure this doesn't happen, but we
greatly appreciate the notice.
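
For readers who want to reproduce the kind of check described in the
quoted message below, here is a minimal sketch in Python. It is not the
check the Ensembl team is building. The layout of species.json is not
described in this thread, so the sketch assumes only that file paths
appear somewhere in the JSON as string values relative to
https://ftp.ebi.ac.uk/pub/ensemblorganisms/ and walks the whole tree to
find them. The helper names (collect_paths, url_exists, find_missing)
are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Any, Callable, Iterable, Iterator, List
from urllib.parse import quote

# Base URL taken from the quoted message.
BASE_URL = "https://ftp.ebi.ac.uk/pub/ensemblorganisms/"


def collect_paths(node: Any) -> Iterator[str]:
    """Yield every string value that looks like a relative file path.

    Assumption: the structure of species.json is unknown here, so the
    tree is walked generically instead of relying on named keys.
    """
    if isinstance(node, dict):
        for value in node.values():
            yield from collect_paths(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_paths(item)
    elif isinstance(node, str) and "/" in node and "://" not in node:
        # Heuristic: relative paths contain a slash but no URL scheme.
        yield node


def url_exists(url: str, timeout: float = 30.0) -> bool:
    """Return True if a HEAD request succeeds and False on HTTP 404."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other HTTP errors are not evidence of a missing file


def find_missing(
    paths: Iterable[str],
    exists: Callable[[str], bool] = url_exists,
    base_url: str = BASE_URL,
) -> List[str]:
    """Return the sorted, de-duplicated paths for which exists() is False."""
    return sorted(p for p in set(paths) if not exists(base_url + quote(p)))
```

To run it against the live server, load species.json with the standard
json module and pass collect_paths(data) to find_missing. With ~65k
referenced files, one HEAD request per file will be slow, so some
throttling or parallelism is advisable. The opposite direction (files
on the FTP but not in species.json) would need directory listings and
is not covered by this sketch.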
Best wishes,
Reham
On 2025-09-12 15:08, Matthieu Muffato wrote:
> Dear Ensembl team,
>
> I parse https://ftp.ebi.ac.uk/pub/ensemblorganisms/species.json [1] to
> find out what can be retrieved from the FTP. I would like to report that
> among the ~65k files referenced, ~3.5k don't seem to exist.
>
> For instance:
>
> "chromosomes.tsv.gz":
> "Kalanchoe_fedtschenkoi/GCA_002312845.1/genome/chromosomes.tsv.gz",
>
> The directory
> https://ftp.ebi.ac.uk/pub/ensemblorganisms/Kalanchoe_fedtschenkoi/GCA_002312845.1/genome/
> [2] exists, but it contains no chromosomes.tsv.gz file.
>
> I see this across a variety of files (cdna.fa.gz, genes.embl.gz,
> regulation.gff, variation.vcf.gz, etc.). Sometimes the entire species
> directory is missing:
>
> "genes.gtf.gz":
> "Melinaea_menophilus_n_ssp_AW-2005/GCA_918358695.1/ensembl/geneset/2022_07/genes.gtf.gz",
>
>
> There is no
> https://ftp.ebi.ac.uk/pub/ensemblorganisms/Melinaea_menophilus_n_ssp_AW-2005/
> [3]
>
> I’d like to understand if the files are genuinely missing and may be
> added later, or perhaps species.json was malformed in the first place.
> I also wonder if the opposite may happen: files present on the FTP but
> not listed in species.json.
>
> Kind regards,
>
> Matthieu (he/him)
>
> --
>
> Informatics Infrastructure Team Lead – Tree of Life programme
>
> Wellcome Sanger Institute
>
>
>
> Links:
> ------
> [1] https://ftp.ebi.ac.uk/pub/ensemblorganisms/species.json
> [2]
> https://ftp.ebi.ac.uk/pub/ensemblorganisms/Kalanchoe_fedtschenkoi/GCA_002312845.1/genome/
> [3]
> https://ftp.ebi.ac.uk/pub/ensemblorganisms/Melinaea_menophilus_n_ssp_AW-2005/
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
> Ensembl Blog: http://www.ensembl.info/