[ensembl-dev] Get ftp url + path from a species name

Andy Yates ayates at ebi.ac.uk
Tue Jan 17 09:51:29 GMT 2012


Hi Céline,

There are a number of elements you can extract from a core database's meta table which allow you to reconstruct the path to the protein dumps. For Ensembl databases you can use the meta keys:

* species.production_name
* schema_version

This lets you build up a path like:

ftp://ftp.ensembl.org/pub/release-${schema_version}/fasta/${species.production_name}/pep

or for example:

ftp://ftp.ensembl.org/pub/release-65/fasta/ailuropoda_melanoleuca/pep/
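
As a rough sketch of the above (in Python rather than the Perl API), the path can be assembled once the two meta values are in hand. This assumes you have already read the meta table into a dictionary of meta_key -> meta_value; the function name and the dictionary input are purely illustrative and not part of any Ensembl API.

```python
def ensembl_pep_url(meta):
    """Return the Ensembl FTP directory holding a species' protein FASTA dumps.

    `meta` is assumed to be a dict of meta_key -> meta_value pairs that the
    caller has already read from the core database's meta table
    (a hypothetical input format chosen for this sketch).
    """
    release = meta["schema_version"]
    species = meta["species.production_name"]
    return f"ftp://ftp.ensembl.org/pub/release-{release}/fasta/{species}/pep/"


# Values taken from the giant panda example above.
panda_meta = {
    "species.production_name": "ailuropoda_melanoleuca",
    "schema_version": "65",
}
print(ensembl_pep_url(panda_meta))
```

Keeping the function free of any database access means the same code works however you choose to fetch the meta values.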

The solution is a bit harder for Ensembl Genomes. Each core database stores its division under the meta key:

* species.division

This can be used, but it needs some manipulation, e.g. for D. melanogaster (d.mel) you need to convert the meta value EnsemblMetazoa to metazoa. The remaining issue is the Ensembl Genomes release number, which for the moment would have to come from a hardcoded mapping of Ensembl release -> EG release, e.g. E!65 == EG12. You can then build the path up like so:

ftp://ftp.ensemblgenomes.org/pub/release-${eg_release}/${eg_division}/fasta/${species.production_name}/pep/

ftp://ftp.ensemblgenomes.org/pub/release-11/metazoa/fasta/acyrthosiphon_pisum/pep/
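
The Ensembl Genomes steps could be sketched in Python along the following lines. Several things here are assumptions made for illustration: the meta values arrive as a dictionary, every division value starts with "Ensembl" (as EnsemblMetazoa does), and the release table contains only the one pairing mentioned above (E!65 == EG12), so you would need to extend it by hand for other releases. None of the names below come from an Ensembl API.

```python
# Hardcoded Ensembl release -> Ensembl Genomes release mapping.
# Only the pairing stated in this thread is included; add others yourself.
ENSEMBL_TO_EG_RELEASE = {65: 12}


def eg_division_dir(division):
    """Convert a species.division meta value to its FTP directory name,
    e.g. 'EnsemblMetazoa' -> 'metazoa'.

    Assumes the value carries an 'Ensembl' prefix; anything else is rejected
    rather than guessed at.
    """
    prefix = "Ensembl"
    if not division.startswith(prefix) or len(division) == len(prefix):
        raise ValueError(f"Unexpected division value: {division!r}")
    return division[len(prefix):].lower()


def eg_pep_url(meta, release_map=ENSEMBL_TO_EG_RELEASE):
    """Return the Ensembl Genomes FTP directory for a species' protein dumps.

    `meta` is assumed to be a dict of meta_key -> meta_value pairs read from
    the core database's meta table (hypothetical input format).
    """
    ensembl_release = int(meta["schema_version"])
    if ensembl_release not in release_map:
        raise KeyError(f"No EG release known for Ensembl release {ensembl_release}")
    eg_release = release_map[ensembl_release]
    division = eg_division_dir(meta["species.division"])
    species = meta["species.production_name"]
    return (
        f"ftp://ftp.ensemblgenomes.org/pub/release-{eg_release}"
        f"/{division}/fasta/{species}/pep/"
    )


# Illustrative meta values for d.mel on an Ensembl 65 schema.
dmel_meta = {
    "species.division": "EnsemblMetazoa",
    "species.production_name": "drosophila_melanogaster",
    "schema_version": "65",
}
print(eg_pep_url(dmel_meta))
```

Failing loudly on an unknown release or an unexpected division value seems safer than building a URL that silently points nowhere.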

I hope this helps you & apologies for the long wait for a reply.

Best regards,

Andy

On 5 Jan 2012, at 16:27, Celine Noirot wrote:

> Hi,
> I'm trying to get the path to download the protein sequences from a species name, but I don't know whether the species is in Ensembl, Plants, Bacteria or something else ...
> Is there a way with the API to get the ftp url and the path to the current version from the species name?
> Best,
> Céline
> -- 
> 
> Céline Noirot
> Plateforme Bioinfo Genotoul- Unité BIA - INRA Toulouse 31326 Castanet-Tolosan
> Tel. 05 61 28 57 24
> http://bioinfo.genotoul.fr

---
Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/




