[ensembl-dev] Triticum aestivum invalid GFF3
Fields, Christopher J
cjfields at illinois.edu
Sat Mar 1 02:12:08 GMT 2014
htseq-count doesn’t support GFF3, only older GFF or GTF (note the ‘gene_id’ attribute is very much a GTF thing). I mention this here (and the reasons why you should be wary):
http://www.biostars.org/p/83903/
In short, don't trust htseq-count results (particularly for any eukaryotic counts) unless you convert to a supported format. This is pretty easy to do, actually; the gffread tool that comes with Cufflinks does a decent job, and there are other tools capable of doing the same. Also be aware that if you use paired-end (PE) alignments the BAM file must be name-sorted, not coordinate-sorted.
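To make the 'gene_id' point concrete, here is a minimal Python sketch (not part of htseq-count or any other tool) that scans annotation lines and reports exon features lacking a 'gene_id' attribute, which is the condition behind the error quoted below. The function names, the scaffold name, the coordinates and the Parent value in the sample lines are made up for illustration; only the feature ID is taken from the error message. It assumes simple attribute columns with no ';' inside quoted values.

```python
import re

# GTF attributes look like: gene_id "X"; transcript_id "Y";
# GFF3 attributes look like: ID=X;Parent=Y
GTF_FIELD = re.compile(r'^(\S+)\s+"([^"]*)"$')


def parse_attributes(column9):
    """Parse column 9 of a GTF or GFF3 line into a dict (simplified)."""
    attrs = {}
    for field in column9.strip().split(";"):
        field = field.strip()
        if not field:
            continue
        match = GTF_FIELD.match(field)
        if match:                      # GTF style: key "value"
            attrs[match.group(1)] = match.group(2)
        elif "=" in field:             # GFF3 style: key=value
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def features_missing_gene_id(lines, feature_type="exon"):
    """Return the IDs of features of the given type that have no gene_id."""
    missing = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue                   # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = parse_attributes(cols[8])
        if "gene_id" not in attrs:
            missing.append(attrs.get("ID", "<no ID>"))
    return missing


# Hypothetical sample lines: everything except the feature ID is invented.
gff3_line = ("scaffold_1\tensembl\texon\t100\t200\t.\t+\t.\t"
             "ID=Traes_3AS_775C097A2.E1;Parent=Traes_3AS_775C097A2.1\n")
gtf_line = ("scaffold_1\tensembl\texon\t100\t200\t.\t+\t.\t"
            'gene_id "Traes_3AS_775C097A2"; '
            'transcript_id "Traes_3AS_775C097A2.1";\n')

if __name__ == "__main__":
    print(features_missing_gene_id([gff3_line, gtf_line]))
```

Running something like this over the first few hundred lines of an annotation file is a quick way to tell whether it needs converting before counting.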
Also, just to point out, there is a much faster C-based multithreaded alternative to htseq-count called featureCounts that takes one or more native BAM files as input (it unfortunately doesn’t support GFF3 either). Results are pretty much identical to htseq-count.
chris
On Feb 28, 2014, at 7:42 PM, Hans Vasquez-Gross <havasquezgross at ucdavis.edu> wrote:
Hi All,
I just wanted to also point out that another standard gff parsing tool isn't able to work with your gff3 reference. It fails with the following error:
samtools view -sB Kronos0_KTC1_bwa_ABgenome_20131211.sorted.rmdup.bam | htseq-count - /Volumes/DATA2/users/havasquezgross/Projects/exon_capture/database/Triticum_aestivum.IWGSP1.21.noQs.gff3
Error occured in line 137 of file /Volumes/DATA2/users/havasquezgross/Projects/exon_capture/database/Triticum_aestivum.IWGSP1.21.noQs.gff3.
Error: Feature Traes_3AS_775C097A2.E1 does not contain a 'gene_id' attribute
[Exception type: SystemExit, raised in count.py:55]
Another item to consider for the March release. I look forward to working with the updated release.
Cheers,
-Hans
_______________________________________________
Dev mailing list Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/