[ensembl-dev] DNA sequence filenames in Ensembl Metazoa FTP without release number
James Allen
jallen at ebi.ac.uk
Fri Dec 11 09:34:42 GMT 2020
Hello,
The file naming conventions differ by file type. Release 49 follows the same
scheme as previous releases (but the scheme isn't entirely consistent, which
doesn't help when trying to explain it...)
Assembly-related files, such as the DNA in FASTA format, do not have the release
number in the filename, but annotation-related files, such as the GTF, do [*].
This is the same for Ensembl and Ensembl Genomes, but the latter uses the EG
release number (e.g. 49) rather than the Ensembl release number (e.g. 102).
The rationale behind this is that filenames include the assembly name, so
assembly-related updates generate filenames that are easily distinguished from
past assemblies/releases. The annotation on an assembly (where "annotation"
includes things like cross-references, as well as genes) can change from one
release to the next. To avoid files having different content but the same
filename in different releases, the release number is included in the name.
Cheers,
James
Ensembl Production
[*] The exceptions to this rule are the annotation-related FASTA files,
containing cDNA or peptide sequences - the content will change when a geneset is
updated, but they do not have a release number in the filename.
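
For anyone scripting downloads, the rule and its exception can be sketched
roughly as below. This is only an illustration, not an Ensembl tool: the
function name and the "kind" labels are made up, the DNA and GTF patterns are
taken from the two example URLs in this thread, and the ".cdna.all" and
".pep.all" suffixes are an assumption that should be checked against the FTP
directory listing.

```python
# Hypothetical helper, not part of any Ensembl API.
# Suffixes for file types whose names carry NO release number.
# "dna" matches the example in this thread; "cdna" and "pep" are assumed.
NO_RELEASE_SUFFIX = {
    "dna": "dna.toplevel.fa.gz",
    "cdna": "cdna.all.fa.gz",  # assumption
    "pep": "pep.all.fa.gz",    # assumption
}

# Suffixes for annotation files whose names DO carry the release number.
WITH_RELEASE_SUFFIX = {
    "gtf": "gtf.gz",
}


def expected_filename(species, assembly, kind, release):
    """Build the filename expected on the FTP site for one file type.

    For Ensembl Genomes, `release` is the EG release (e.g. 49);
    for Ensembl it is the Ensembl release (e.g. 102).
    """
    prefix = f"{species}.{assembly}"
    if kind in NO_RELEASE_SUFFIX:
        return f"{prefix}.{NO_RELEASE_SUFFIX[kind]}"
    if kind in WITH_RELEASE_SUFFIX:
        return f"{prefix}.{release}.{WITH_RELEASE_SUFFIX[kind]}"
    raise ValueError(f"unknown file kind: {kind!r}")


if __name__ == "__main__":
    for kind in ("dna", "gtf"):
        print(expected_filename("Drosophila_simulans", "ASM75419v3", kind, 49))
```

Keeping the two groups in separate tables makes the exception explicit: the
cDNA and peptide FASTA files are annotation-related but sit with the files
that have no release number.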
On 09/12/2020 13:52, Sebastien Moretti wrote:
> Hi
>
> I wonder why the release number disappeared from the DNA sequence filenames in
> the Ensembl Metazoa FTP.
> e.g. for D. simulans:
> ftp://ftp.ensemblgenomes.org/pub/metazoa/release-49/fasta/drosophila_simulans/dna/Drosophila_simulans.ASM75419v3.dna.toplevel.fa.gz
>
> -> no .49
>
>
> The GTF filenames still contain the release number.
> e.g.
> ftp://ftp.ensemblgenomes.org/pub/metazoa/release-49/gtf/drosophila_simulans/Drosophila_simulans.ASM75419v3.49.gtf.gz
>
> -> .49 is present
>
> And this is not the case in the main Ensembl FTP.
>
> Best
>
> --
> Sébastien Moretti
> Staff Scientist
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221/4079
> http://bioinfo.unil.ch/ http://bgee.org/ http://selectome.unil.ch/
>