[ensembl-dev] Triticum aestivum invalid GFF3

Fields, Christopher J cjfields at illinois.edu
Sat Mar 1 02:12:08 GMT 2014


htseq-count doesn’t support GFF3, only older GFF or GTF (note the ‘gene_id’ attribute is very much a GTF thing).  I mention this here (and the reasons why you should be wary):

   http://www.biostars.org/p/83903/

In short, don't trust htseq-count results (particularly for any eukaryotic counts) unless you convert to a supported format.  This is pretty easy to do, actually; the gffread tool that comes with Cufflinks does a decent job, and there are other tools capable of doing the same.  Also be aware that if you use paired-end (PE) alignments the BAM file must be name-sorted, not coordinate-sorted.
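
As a quick sanity check before counting, here is a minimal sketch (not part of htseq-count) that lists which feature lines lack a given attribute.  It assumes the tool's defaults of counting 'exon' features and grouping them by 'gene_id'; the function names and both example lines are made up for illustration, and the parser is deliberately naive (it would mis-split a quoted value containing ';').

```python
import re

# Matches a GTF-style attribute field such as: gene_id "geneA"
GTF_FIELD = re.compile(r'^(\S+)\s+"')


def attribute_keys(column9):
    """Return the attribute names found in column 9 of a GFF3 or GTF line."""
    keys = set()
    for field in column9.split(";"):
        field = field.strip()
        if not field:
            continue
        gtf_match = GTF_FIELD.match(field)
        if gtf_match:
            keys.add(gtf_match.group(1))          # GTF style: key "value"
        elif "=" in field:
            keys.add(field.split("=", 1)[0])      # GFF3 style: key=value
    return keys


def features_missing_attribute(lines, feature_type="exon", attribute="gene_id"):
    """Return 1-based line numbers of matching features that lack `attribute`."""
    missing = []
    for line_number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] != feature_type:
            continue  # only nine-column lines of the requested feature type
        if attribute not in attribute_keys(columns[8]):
            missing.append(line_number)
    return missing


# Hypothetical example lines: one GFF3-style exon, one GTF-style exon.
gff3_line = "chrExample\tsketch\texon\t100\t200\t.\t+\t.\tID=exonA;Parent=geneA.1\n"
gtf_line = 'chrExample\tsketch\texon\t100\t200\t.\t+\t.\tgene_id "geneA"; transcript_id "geneA.1";\n'

print(features_missing_attribute([gff3_line, gtf_line]))
```

Run against a real annotation (e.g. by passing an open file handle as `lines`), a non-empty result means the file still needs converting before it is handed to the counter.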

Also, just to point out, there is a much faster C-based multithreaded alternative to htseq-count called featureCounts that takes one or more native BAM files as input (it unfortunately doesn’t support GFF3 either).  Results are pretty much identical to htseq-count.

chris

On Feb 28, 2014, at 7:42 PM, Hans Vasquez-Gross <havasquezgross at ucdavis.edu> wrote:

Hi All,

  I just wanted to also point out that another standard GFF parsing tool isn't able to work with your GFF3 reference; it fails with the following error:

 samtools view -sB Kronos0_KTC1_bwa_ABgenome_20131211.sorted.rmdup.bam | htseq-count - /Volumes/DATA2/users/havasquezgross/Projects/exon_capture/database/Triticum_aestivum.IWGSP1.21.noQs.gff3
Error occured in line 137 of file /Volumes/DATA2/users/havasquezgross/Projects/exon_capture/database/Triticum_aestivum.IWGSP1.21.noQs.gff3.
Error: Feature Traes_3AS_775C097A2.E1 does not contain a 'gene_id' attribute
[Exception type: SystemExit, raised in count.py:55]

Another item to consider for the March release.  I look forward to working with the updated release.

Cheers,
-Hans
