[ensembl-dev] Downloading annotation from Ensembl
Giulietta
gspudich at ebi.ac.uk
Wed Jan 22 15:14:11 GMT 2014
On 21/01/2014 15:05, Greg Slodkowicz wrote:
> Hi Fiona,
>
> We do have GTF files for our gene annotations. You can see a table
> of all our download files here:
> http://www.ensembl.org/info/data/ftp/index.html
>
>
> Thanks for getting back to me. I've previously downloaded the GTF
> file under "Gene sets" but it seems that it contained coordinates of
> genes, exons and CDS but not other functional features such as
> domains, secondary structure predictions etc. Is this the file
> you're referring to?
Hi Greg,
Apologies, I am not completely clear on what information you would like
to access. The Nature paper focuses on conserved regions and potential
gene regulatory sequences, so I will address how to find that
information in Ensembl:
The Perl API allows programmatic access to conserved and potentially
functional regions of the genome.
Installation instructions are here:
http://www.ensembl.org/info/docs/api/api_installation.html
For conserved regions ('constrained elements'), the Compara Perl API is
an option. Have a look at this tutorial:
http://www.ensembl.org/info/docs/api/compara/compara_tutorial.html
If you are looking for hypersensitive sites, transcription factor binding
sites, and histone modifications, go via the Regulation API:
http://www.ensembl.org/info/docs/api/funcgen/regulation_tutorial.html
There is documentation for both our Compara and Regulation resources here:
http://www.ensembl.org/info/genome/compara/index.html
and
http://www.ensembl.org/info/genome/funcgen/index.html
****
If you are looking for protein domains from InterProScan, and
coiled-coil regions from the ncoils program, you can retrieve them
through BioMart or the Perl API.
BioMart provides an interface for programmers and non-programmers
alike. To learn a bit about how to use the web interface, watch our
quick tutorial:
BioMart: An Introduction
http://youtu.be/DXPaBdPM2vs
You would want to use 'Filters' in the 'PROTEIN DOMAINS' section.
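For the programmatic route, BioMart also accepts an XML query sent to its
martservice URL. Below is a minimal sketch in Python that only builds such
a query and the matching URL; it does not contact the server. The dataset
name (hsapiens_gene_ensembl), the filter (chromosome_name) and the
attribute names (interpro, interpro_start, interpro_end and so on) are
assumptions for illustration, so please check them against the names shown
in the BioMart web interface for the release you are using.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

# BioMart service endpoint on the main Ensembl site.
MART_URL = "http://www.ensembl.org/biomart/martservice"


def build_query(dataset, attributes, filters=None):
    """Return a BioMart XML query string for a single dataset."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",      # tab-separated output
        "header": "1",           # include a header row
        "uniqueRows": "1",       # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    # Filters restrict which rows come back.
    for name, value in (filters or {}).items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Attributes are the columns of the result table.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    prolog = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return prolog + ET.tostring(query, encoding="unicode")


def query_url(xml):
    """Return the martservice URL carrying the URL-encoded XML query."""
    return MART_URL + "?" + urlencode({"query": xml})


# Assumed names: verify them in the BioMart interface before use.
xml = build_query(
    "hsapiens_gene_ensembl",
    ["ensembl_gene_id", "ensembl_peptide_id",
     "interpro", "interpro_start", "interpro_end"],
    {"chromosome_name": "21"},
)
url = query_url(xml)
print(url)
```

Opening the printed URL (in a browser, or with urllib.request.urlopen)
should return the table as tab-separated text, provided the names above
are valid for the current release.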
The API access would be through the Core API.
****
Let us know if you have any trouble accessing the information you want,
or if you need some further explanation on specific data types.
Best wishes,
Giulietta
> What are your features? If they are variation data you could use
> our variant effect predictor (VEP) http://www.ensembl.org/VEP.
>
>
> They're sitewise predictions of evolutionary constraint, very similar
> to those from
> http://www.nature.com/nature/journal/v478/n7370/abs/nature10530.html.
>
> Best,
> Greg
>
> --
> Greg Slodkowicz
> PhD student, Nick Goldman group
> European Bioinformatics Institute (EMBL-EBI)
>
>
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/