[ensembl-dev] Error on using parse_ncbi_gff3.pl

Thibaut Hourlier thibaut at ebi.ac.uk
Wed Sep 21 14:18:18 BST 2016


Hi David,
Unfortunately, NCBI does not format its GFF files the same way for every species, so a fix for one species can introduce a bug for another.
Could you please tell us which branch and last commit you are using for your ensembl-pipeline and ensembl-io repositories?

Thanks
Thibaut
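
For reference, the kind of check Dan quotes further down in this thread (next if $line =~ /^#/;) can be sketched in isolation as below. This is only a minimal illustration, assuming the '###' directive is what trips the parser, which has not been confirmed here. The helper name is_skippable_gff3_line and the sample records are made up for the example; they are not taken from parse_ncbi_gff3.pl or ensembl-io.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of parse_ncbi_gff3.pl or ensembl-io):
# returns 1 for blank lines and for any line starting with '#', which
# covers comments, '##' pragmas and the '###' directive; 0 otherwise.
sub is_skippable_gff3_line {
    my ($line) = @_;
    return ( $line =~ /^\s*$/ || $line =~ /^#/ ) ? 1 : 0;
}

# Made-up sample data: a pragma, two feature lines and a closing '###'.
my $sample = join "\n",
    '##gff-version 3',
    "chr1\tRefSeq\tgene\t100\t500\t.\t+\t.\tID=gene0",
    "chr1\tRefSeq\tCDS\t100\t500\t.\t+\t0\tID=cds0;Parent=gene0",
    '###',
    '';

# Read the sample through an in-memory filehandle, line by line.
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my @features;
while ( my $line = <$fh> ) {
    chomp $line;
    next if is_skippable_gff3_line($line);    # same idea as: next if $line =~ /^#/;
    my @columns = split /\t/, $line;
    # Column 3 is the feature type, column 8 the phase.
    push @features, { type => $columns[2], phase => $columns[7] };
}
close $fh;

printf "%s phase=%s\n", $_->{type}, $_->{phase} for @features;
```

With the sample above, only the gene and CDS lines reach the feature-handling code; the '##gff-version 3' and '###' lines are skipped before any field is read.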

> On 20 Sep 2016, at 13:26, Daniel Barrell <daniel.barrell at eaglegenomics.com> wrote:
> 
> Odd, D. rerio should also have failed then, if my suspicions were correct. I guess there must be something else going on here.
> 
> Dan
> 
> Daniel Barrell
> Platform Specialist
> eaglediscover Best of Show Winner at Bio-IT World 2016
> 
> Eagle Genomics Ltd
> T: +44 (0)1223 654481
> http://www.eaglegenomics.com
> Disclaimer: http://www.eaglegenomics.com/about/privacy-statement/
> 
> https://youtu.be/rPdgFTo0FZM
> 
> On 20 September 2016 at 12:14, Herzig, David <david.herzig at roche.com> wrote:
> Hi Daniel
> 
> Thanks for the feedback.
> 
> I was able to use it for:
> - D. rerio
> - M. musculus
> - R. norvegicus
> 
> regards,
> David
> 
> 
> On Tue, Sep 20, 2016 at 1:11 PM, Daniel Barrell <daniel.barrell at eaglegenomics.com> wrote:
> Hi David,
> 
> Line 1184334 is the last line of the GFF3 file and contains '###'. There used to be code to ignore lines like these:
> 
> +      next if $line =~ /^#/;
> 
> When the script moved to ensembl-io, I think it may have lost this check; however, I would expect ensembl-io to handle the '###'. Which species files worked? I checked on NCBI, and other species (e.g. horse) would also fail in the same way.
> 
> Dan
> 
> On 20 September 2016 at 11:16, Herzig, David <david.herzig at roche.com> wrote:
> Hi Ensembl Users
> 
> I have set up the Ensembl environment for several species. Everything is OK.
> After that I imported data from NCBI using the parse_ncbi_gff3.pl script. It works fine for almost all species, but for the species Sus scrofa I have the following issue:
> 
> I downloaded the file from NCBI:
> http://ftp.ncbi.nlm.nih.gov/genomes/Sus_scrofa/GFF/ref_Sscrofa10.2_top_level.gff3
> 
> I used the parse_ncbi_gff3.pl script to import it.
> 
> The process starts successfully, but after a while I get the following error message and the process stops:
> 
> Can't call method "phase" on an undefined value at /home/ensembl/release-85/ensembl-pipeline/scripts/refseq_import/parse_ncbi_gff3.pl line 882, <__ANONIO__> line 1184334.
> 
> Any ideas?
> 
> regards,
> David
> 
> 
> -- 
> David Herzig
> Scientist, pRED Informatics
> Roche Pharma Research and Early Development
> 
> Roche Innovation Center Basel
> 
> F. Hoffmann-La Roche Ltd
> Grenzacherstrasse 124
> 4070 Basel
> Switzerland
> Phone +41 61 687 31 70
> Learn more about pRED Informatics at http://go.roche.com/pREDi
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
