[ensembl-dev] GTF for RNA-seq gene models?

Bronwen Aken ba1 at sanger.ac.uk
Fri Aug 3 10:25:43 BST 2012


Dear Julien,

We do not have GTF files available for the gene models derived from the Human BodyMap 2.0 data. You will be able to get the exon coordinates you need using our API. The script that you mentioned, dump_transcripts.pl, prints the following information in addition to the transcript sequence:

For each gene and transcript:
Exon number
Exon slice name
Exon start
Exon end
Exon strand

This information is printed using the following line:
print INFO "    EXON ".$num_exons." chr ".$exon->slice->seq_region_name." start ".$exon->start." end ".$exon->end." strand ".$exon->strand."\n";
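
As an illustration only (this is not part of dump_transcripts.pl), the EXON lines printed above could be turned into GTF exon records with a small post-processing step. The sketch below is in Python and assumes the line layout shown in the print statement. The function name, the "rnaseq" source label and the gene/transcript identifiers passed in by the caller are hypothetical, since the layout of the gene and transcript lines is not shown here. Ensembl coordinates are 1-based and inclusive, as in GTF, so start and end are copied unchanged.

```python
import re

# Matches an EXON line as printed by the script, e.g.
# "    EXON 2 chr 1 start 100 end 200 strand -1"
EXON_RE = re.compile(
    r"^\s*EXON\s+(\d+)\s+chr\s+(\S+)\s+start\s+(\d+)\s+end\s+(\d+)"
    r"\s+strand\s+(-?\d+)\s*$"
)

# Ensembl reports strand as 1 or -1; GTF uses "+" or "-".
STRAND = {1: "+", -1: "-"}


def exon_line_to_gtf(line, gene_id, transcript_id, source="rnaseq"):
    """Convert one EXON line to a GTF exon record, or return None.

    gene_id and transcript_id must be supplied by the caller
    (hypothetical parameters; they are not parsed from the line).
    """
    match = EXON_RE.match(line)
    if match is None:
        return None  # not an EXON line
    exon_number, chrom, start, end, strand = match.groups()
    attributes = 'gene_id "{}"; transcript_id "{}"; exon_number "{}";'.format(
        gene_id, transcript_id, exon_number
    )
    fields = [
        chrom,                          # seqname
        source,                         # source label (assumed)
        "exon",                         # feature type
        start,                          # 1-based start, unchanged
        end,                            # inclusive end, unchanged
        ".",                            # score not available
        STRAND.get(int(strand), "."),   # "." for any other strand value
        ".",                            # frame not applicable to exons
        attributes,
    ]
    return "\t".join(fields)
```

Calling exon_line_to_gtf() on each line of the INFO output, with the identifiers of the enclosing gene and transcript, gives one tab-separated GTF line per exon; lines that are not EXON lines return None and can be skipped.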

If you don't need the FASTA sequence, the script will run faster if you comment out this line:
 #     print FASTA ">".$transcript->stable_id."\n".$transcript->seq->seq."\n";

We are in the process of re-generating the gene models derived from the Human BodyMap 2.0 data, using our improved pipeline.

Hope that helps,
Bronwen


On 23 Jul 2012, at 18:00, Julien Roux wrote:

> Hi all,
> I am wondering if a GTF/GFF file would be available for the gene models derived from RNA-seq of 16 Bodymap tissues?
> I have found the script dump_transcripts.pl (http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl-pipeline/scripts/examples/?root=ensembl), but it outputs the RNA-seq transcripts in FASTA format, without any information on the splice junctions and exon coordinates...
> Thanks for your tips
> Julien
> 
> -- 
> Julien Roux, PhD
> Gilad lab, Department of Human Genetics, University of Chicago
> http://giladlab.uchicago.edu/
> 920 East 58th Street, CLSC 317, Chicago, IL 60637, USA
> tel: +1-773-834-1984   fax: +1-773-834-8470
