[ensembl-dev] DNA sequence filenames in Ensembl Metazoa FTP without release number

James Allen jallen at ebi.ac.uk
Fri Dec 11 09:34:42 GMT 2020

File naming conventions differ by file type. Release 49 follows the same 
scheme as previous releases (although the scheme isn't entirely consistent, 
which doesn't help when trying to explain it...)

Assembly-related files, such as the DNA in FASTA format, do not have the release 
number in the filename, but annotation-related files, such as the GTF, do [*]. 
This is the same for Ensembl and Ensembl Genomes, but the latter uses the EG 
release number (e.g. 49) rather than the Ensembl release number (e.g. 102).
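
As a rough illustration of the two patterns, here is a minimal Python sketch 
based only on the two D. simulans filenames quoted below. The function names 
are made up for this example, and the "dna.toplevel" suffix is simply copied 
from that one file, so treat this as an assumption-laden sketch rather than a 
description of how the files are actually produced.

```python
def dna_fasta_name(species: str, assembly: str) -> str:
    """Assembly-related file: no release number in the name.

    Hypothetical helper; the pattern is inferred from a single example file.
    """
    return f"{species}.{assembly}.dna.toplevel.fa.gz"


def gtf_name(species: str, assembly: str, release: int) -> str:
    """Annotation-related file: the release number follows the assembly name.

    For Ensembl Genomes divisions, `release` is the EG release (e.g. 49);
    for Ensembl it is the Ensembl release (e.g. 102).
    """
    return f"{species}.{assembly}.{release}.gtf.gz"


if __name__ == "__main__":
    print(dna_fasta_name("Drosophila_simulans", "ASM75419v3"))
    print(gtf_name("Drosophila_simulans", "ASM75419v3", 49))
```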

The rationale behind this is that filenames include the assembly name, so 
assembly-related updates generate filenames that are easily distinguished from 
past assemblies/releases. The annotation on an assembly (where "annotation" 
includes things like cross-references as well as genes) can change from one 
release to the next, so to avoid files having different content but the same 
filename in different releases, the release number is included in the name.
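
One practical consequence is that the release can be read back from a GTF 
filename, but not from a DNA FASTA filename. The sketch below assumes the 
release is the purely numeric field immediately before ".gtf.gz", as in the 
D. simulans example; the function name is hypothetical and other file types 
are not covered.

```python
import re
from typing import Optional

# Assumption: the release number is the all-digit field right before ".gtf.gz".
_GTF_RELEASE = re.compile(r"\.(\d+)\.gtf\.gz$")


def release_from_gtf_name(filename: str) -> Optional[int]:
    """Return the release number embedded in a GTF filename, or None."""
    match = _GTF_RELEASE.search(filename)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(release_from_gtf_name("Drosophila_simulans.ASM75419v3.49.gtf.gz"))
    print(release_from_gtf_name("Drosophila_simulans.ASM75419v3.dna.toplevel.fa.gz"))
```

The assembly name (ASM75419v3) also contains digits, which is why the pattern 
only accepts a field made up entirely of digits.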

Ensembl Production

[*] The exceptions to this rule are the annotation-related FASTA files, 
containing cDNA or peptide sequences - the content will change when a geneset is 
updated, but they do not have a release number in the filename.
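
To summarise the rule and its exceptions in one place, here is a small lookup 
table in Python. The keys ("dna", "gtf", "cdna", "pep") and all names are 
labels chosen for this sketch, not official identifiers, and only the file 
types mentioned in this message are listed.

```python
from typing import NamedTuple


class FileKind(NamedTuple):
    annotation_related: bool  # content depends on the annotation/geneset
    release_in_name: bool     # filename includes the release number


# Hypothetical labels; only file types discussed in this thread are included.
FILE_KINDS = {
    "dna": FileKind(annotation_related=False, release_in_name=False),
    "gtf": FileKind(annotation_related=True, release_in_name=True),
    "cdna": FileKind(annotation_related=True, release_in_name=False),  # exception
    "pep": FileKind(annotation_related=True, release_in_name=False),   # exception
}


def same_name_may_hide_changes(kind: str) -> bool:
    """True if content can change between releases while the filename stays the same."""
    info = FILE_KINDS[kind]
    return info.annotation_related and not info.release_in_name


if __name__ == "__main__":
    for kind in FILE_KINDS:
        print(kind, same_name_may_hide_changes(kind))
```

For the file types flagged here, the release directory in the FTP path is the 
only indication of which release a file belongs to.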

On 09/12/2020 13:52, Sebastien Moretti wrote:
> Hi
> I wonder why the release number disappeared from the DNA sequence filenames in 
> the Ensembl Metazoa FTP.
> e.g. for D. simulans:
> ftp://ftp.ensemblgenomes.org/pub/metazoa/release-49/fasta/drosophila_simulans/dna/Drosophila_simulans.ASM75419v3.dna.toplevel.fa.gz 
> -> no .49
> The GTF filenames still contain the release number.
> e.g. 
> ftp://ftp.ensemblgenomes.org/pub/metazoa/release-49/gtf/drosophila_simulans/Drosophila_simulans.ASM75419v3.49.gtf.gz 
> -> .49 is present
> And this is not the case in the main Ensembl FTP.
> Best
> -- 
> Sébastien Moretti
> Staff Scientist
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221/4079
> http://bioinfo.unil.ch/ http://bgee.org/ http://selectome.unil.ch/
