[ensembl-dev] problem importing GenBank file into local core DB

梁薰文 a24681012142002 at gmail.com
Thu Jul 7 03:22:47 BST 2016


Hello, 

I'm new to the Ensembl pipeline. I am trying to import annotation from GenBank files for prokaryotic (circular) genomes and plasmids into a local Ensembl core DB. Some genes span the origin (the start coordinate of the genome, e.g. "join(185894..187571, 1..1172)"), and this causes the import to fail.

Here is the error message I got: "MSG: Start (185894) must be less than or equal to end+1 (1172)". 
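
To illustrate why this location trips the check, here is a minimal Python sketch (not Ensembl code). It assumes a simple "join(a..b, c..d)" string with no fuzzy positions or remote references; parse_join and spans_origin are hypothetical helper names. Collapsing the join to a single start/end pair gives a start that is greater than end+1, which is the condition the error message reports.

```python
import re

def parse_join(location):
    """Return (start, end) pairs from a simple GenBank join(...) string.

    Assumption: only plain 'a..b' segments, no '<', '>' or remote accessions.
    """
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]

def spans_origin(segments):
    """True if some segment starts before the previous segment ends,
    i.e. the feature wraps past the end of the circular sequence."""
    return any(nxt[0] < prev[1] for prev, nxt in zip(segments, segments[1:]))

segments = parse_join("join(185894..187571, 1..1172)")

# Collapsing the join to one start/end pair, as a linear model would:
feature_start = segments[0][0]   # 185894
feature_end = segments[-1][1]    # 1172

print(spans_origin(segments))             # True
print(feature_start <= feature_end + 1)   # False: the failing condition
```

A check like this can at least list the affected features in a GenBank file before the import is attempted.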

So far, my workaround has been to modify the GenBank file. I used Bio::SeqIO to shift the positions of all the genes along the genome so that no gene spans the origin any more. However, this approach makes my local DB inconsistent with the original GenBank record.
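
The coordinate arithmetic behind such a shift can be sketched as follows. This is plain Python rather than the Bio::SeqIO script itself, and both values are assumptions for illustration: the sequence length of 187571 is only what the first segment of the join above suggests, and the offset of 2000 is arbitrary. The sequence itself has to be rotated by the same offset. Keeping the offset makes it possible to map shifted coordinates back to the original GenBank coordinates.

```python
GENOME_LENGTH = 187571  # assumed: inferred from the end of the first join segment
OFFSET = 2000           # assumed: arbitrary illustrative shift

def shift(pos, offset=OFFSET, length=GENOME_LENGTH):
    """Move a 1-based position forward by `offset` on a circular sequence."""
    return (pos - 1 + offset) % length + 1

def unshift(pos, offset=OFFSET, length=GENOME_LENGTH):
    """Map a shifted 1-based position back to the original coordinate."""
    return (pos - 1 - offset) % length + 1

# Original origin-spanning feature: join(185894..187571, 1..1172)
original = [(185894, 187571), (1, 1172)]
shifted = [(shift(s), shift(e)) for s, e in original]

print(shifted)  # [(323, 2000), (2001, 3172)]: now one contiguous range 323..3172
print([(unshift(s), unshift(e)) for s, e in shifted] == original)  # True
```

With these assumed values the feature becomes the single range 323..3172, which keeps its length of 2850 bases and satisfies start <= end+1.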

Is there a more formal way to handle genes that span the origin of a circular genome when importing gene annotation into an Ensembl core DB?

Thank you very much for your kind help.

Susan


