[ensembl-dev] multiple stops in Taestivum annotation [MIPS v2.1, Ensembl v22]

Arnaud Kerhornou arnaud at ebi.ac.uk
Thu May 1 12:20:06 BST 2014


Hi Ksenia,

Further to Will's reply, I had a look at the GFF3 we provide and at the 
warnings you get.
They seem to occur only for the first exon of a transcript, and only in 
transcripts with more than one exon.
In the GFF3 we generate, the exon phase for these first exons is not set.

According to the GFF3 specification, this is valid:
http://www.sequenceontology.org/gff3.shtml
The phase is represented on the CDS features.
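
For anyone who wants to check this on their own copy of the file, here is 
a minimal Python sketch that tallies the phase column (column 8) per 
feature type (column 3). It is not the code we use to generate the GFF3; 
the function name and the example records are made up for illustration.

```python
from collections import Counter, defaultdict


def phase_summary(gff3_lines):
    """Tally the phase column (column 8) per feature type (column 3)."""
    summary = defaultdict(Counter)
    for line in gff3_lines:
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        summary[cols[2]][cols[7]] += 1
    return {ftype: dict(counts) for ftype, counts in summary.items()}


# Hypothetical records: one exon with no phase, two CDS segments with a phase.
example = [
    "##gff-version 3",
    "scaffold_1\tensembl\texon\t100\t250\t.\t+\t.\tParent=transcript:T1",
    "scaffold_1\tensembl\tCDS\t100\t250\t.\t+\t0\tParent=transcript:T1",
    "scaffold_1\tensembl\tCDS\t400\t520\t.\t+\t2\tParent=transcript:T1",
]
print(phase_summary(example))
```

On the real file, the same function could be fed the lines of the 
decompressed GFF3, for example via gzip.open(path, "rt").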

The only case where I can see this information being useful is when 
there is no 5' UTR and the start of the gene is missing; the first exon 
could then carry a phase flag.
In any case, we'll look into rationalising the way we represent phase 
information for exon features in the next release, to avoid any confusion.
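
As a rough way to list the affected transcripts, the sketch below collects 
exon features per parent transcript and reports multi-exon transcripts 
whose first exon (in transcript order) has no phase. It rests on 
assumptions: that an unset phase appears as "." in column 8, that exons 
are linked to transcripts through the Parent attribute, and that exons of 
one transcript do not overlap. The function name and example records are 
hypothetical.

```python
from collections import defaultdict


def first_exons_without_phase(gff3_lines):
    """Return parents of multi-exon transcripts whose first exon has phase '.'."""
    exons = defaultdict(list)  # parent ID -> [(start, end, strand, phase)]
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # An exon may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((int(cols[3]), int(cols[4]), cols[6], cols[7]))

    flagged = []
    for parent, items in exons.items():
        if len(items) < 2:
            continue  # single-exon transcripts are not of interest here
        strand = items[0][2]
        # First exon in transcript order: lowest start on '+', highest on '-'.
        first = max(items) if strand == "-" else min(items)
        if first[3] == ".":  # assumption: '.' means the phase is not set
            flagged.append(parent)
    return sorted(flagged)


# Hypothetical records covering the three situations of interest.
example = [
    "s1\tensembl\texon\t100\t250\t.\t+\t.\tParent=transcript:T1",
    "s1\tensembl\texon\t400\t520\t.\t+\t0\tParent=transcript:T1",
    "s1\tensembl\texon\t900\t990\t.\t+\t.\tParent=transcript:T2",
    "s2\tensembl\texon\t100\t200\t.\t-\t.\tParent=transcript:T3",
    "s2\tensembl\texon\t300\t400\t.\t-\t1\tParent=transcript:T3",
]
print(first_exons_without_phase(example))
```

Only T1 is reported here: T2 has a single exon, and on the minus strand 
the first exon of T3 is the one at 300-400, which carries a phase.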

Let us know if the Ensembl Variant Effect Predictor (VEP) works for you.

Best regards,
Arnaud

On 01/05/2014 11:01, Will McLaren wrote:
> Hello Ksenia,
>
> Have you considered using the VEP? This is Ensembl's equivalent to 
> snpEff, and offers many features that snpEff does not. It also does 
> not depend on using these GFF files.
>
> http://www.ensembl.org/info/docs/tools/vep/script/index.html
>
> To set up for use with wheat, simply run the installer as follows:
>
> perl INSTALL.pl -u ftp://ftp.ensemblgenomes.org/pub/plants/release-22/vep/ -s triticum_aestivum -a ac
>
> Regarding your issue with the GFF3 file, we think this may be due to a 
> problem with exon phase in the file, though someone from the Ensembl 
> Genomes project may be able to comment further.
>
> HTH
>
> Will McLaren
> Ensembl Variation
>
>
> On 30 April 2014 23:08, Ksenia Krasileva <krasileva at ucdavis.edu> wrote:
>
>     Dear Ensembl team,
>
>     I am currently working with a GFF3 annotation of Triticum aestivum
>     (MIPS v2.1, Ensembl v22).
>     ftp://ftp.ensemblgenomes.org/pub/plants/release-22/gff3/triticum_aestivum/Triticum_aestivum.IWGSP1.22.gff3.gz
>
>     In our study, we are examining single nucleotide polymorphism
>     variants in wheat and their effect on protein-coding regions using
>     SnpEff 3.5d (build 2014-03-05).
>
>     I see the warning message
>     "WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS" from SnpEff in 17,834
>     genes. Below are 10 examples of genes that have this warning and
>     10 that do not.
>
>     Do you know what might be causing this warning, or could you
>     refer me to someone who can help?
>
>     Thank you in advance.
>
>     Best wishes,
>
>     Ksenia
>
>     -- 
>     Ksenia Krasileva, PhD
>
>     Post Doctoral Scholar
>     USDA NIFA Fellowship
>     The Dubcovsky Lab
>
>     Department of Plant Sciences
>     University of California, Davis
>     124 Robbins Hall
>     Davis, CA 95616
>     krasileva at ucdavis.edu
>
>     SnpEff data:
>
>     10 examples without warnings:
>     IWGSC_CSS_5BS_scaff_1487662    672    Kronos381    G    A    40.0    Pass    seed_avail=Kronos381;EFF=SYNONYMOUS_CODING(LOW|SILENT|ggC/ggT|G280||Traes_5BS_38DA1CBC61.E1|||transcript:Traes_5BS_38DA1CBC61.2|1|1)
>     IWGSC_CSS_5BS_scaff_1537048    1688    Kronos244    C    T    40.0    Pass    seed_avail=Kronos244;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|gCt/gTt|A58V||CDS:Traes_5BS_B2CD001D1.2|||transcript:Traes_5BS_B2CD001D1.2|2|1)
>     IWGSC_CSS_5BS_scaff_1537048    1718    Kronos439    C    T    40.0    Pass    seed_avail=Kronos439;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|gCt/gTt|A68V||CDS:Traes_5BS_B2CD001D1.2|||transcript:Traes_5BS_B2CD001D1.2|2|1)
>     IWGSC_CSS_5BS_scaff_1646558    3136    Kronos684    G    A    40.0    Pass    seed_avail=Kronos684;EFF=SYNONYMOUS_CODING(LOW|SILENT|acC/acT|T115||Traes_5BS_E2116F66A.E1|||transcript:Traes_5BS_E2116F66A.1|1|1)
>     IWGSC_CSS_5BS_scaff_168385    2625    Kronos2646    G    A    40.0    Pass    seed_avail=Kronos2646;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|Gag/Aag|E18K||Traes_5BS_91BD3E004.E1|||transcript:Traes_5BS_91BD3E004.1|1|1)
>     IWGSC_CSS_5BS_scaff_168385    2666    Kronos390    C    T    40.0    Pass    seed_avail=Kronos390;EFF=SYNONYMOUS_CODING(LOW|SILENT|gcC/gcT|A31||Traes_5BS_91BD3E004.E1|||transcript:Traes_5BS_91BD3E004.1|1|1)
>     IWGSC_CSS_5BS_scaff_168385    2714    Kronos166    G    A    40.0    Pass    seed_avail=Kronos166;EFF=SYNONYMOUS_CODING(LOW|SILENT|gaG/gaA|E47||Traes_5BS_91BD3E004.E1|||transcript:Traes_5BS_91BD3E004.1|1|1)
>     IWGSC_CSS_5BS_scaff_1712494    1012    Kronos447    C    T    40.0    Pass    seed_avail=Kronos447;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|Ggg/Agg|G681R||Traes_5BS_6004CA039.E3|||transcript:Traes_5BS_6004CA039.1|1|1)
>     IWGSC_CSS_5BS_scaff_1712494    2037    Kronos2214    G    A    40.0    Pass    seed_avail=Kronos2214;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|tCc/tTc|S339F||Traes_5BS_6004CA039.E3|||transcript:Traes_5BS_6004CA039.1|1|1)
>     IWGSC_CSS_5BS_scaff_1712494    2086    Kronos199    G    A    40.0    Pass    seed_avail=Kronos199;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|Cct/Tct|P323S||Traes_5BS_6004CA039.E3|||transcript:Traes_5BS_6004CA039.1|1|1)
>
>
>     10 Examples with warnings:
>     IWGSC_CSS_1BS_scaff_1134305    892    Kronos2092    G    A    40.0    Pass    seed_avail=Kronos2092;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|Gtt/Att|V123I||Traes_1BS_9AD3B3BE2.E22|||transcript:Traes_1BS_9AD3B3BE2.1|4|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS),SPLICE_SITE_REGION(LOW|||||Traes_1BS_9AD3B3BE2.E22|||transcript:Traes_1BS_9AD3B3BE2.1|4|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS)
>     IWGSC_CSS_1BS_scaff_1134305    5997    Kronos2711    C    T    40.0    Pass    seed_avail=Kronos2711;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|gCt/gTt|A806V||Traes_1BS_9AD3B3BE2.E22|||transcript:Traes_1BS_9AD3B3BE2.1|18|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS)
>     IWGSC_CSS_1BS_scaff_1260855    1972    Kronos375    G    A    40.0    Pass    seed_avail=Kronos375;EFF=SYNONYMOUS_CODING(LOW|SILENT|cgG/cgA|R266||Traes_1BS_91DEF138E.E1|||transcript:Traes_1BS_91DEF138E.1|6|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS)
>     IWGSC_CSS_1BS_scaff_1280056    767    Kronos166    C    T    40.0    Pass    seed_avail=Kronos166;EFF=SYNONYMOUS_CODING(LOW|SILENT|aaC/aaT|N93||Traes_1BS_984B78A41.E3|||transcript:Traes_1BS_984B78A41.1|3|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS)
>     IWGSC_CSS_1BS_scaff_1420918    702    Kronos209    G    A    40.0    Pass    seed_avail=Kronos209;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|gGg/gAg|G138E||Traes_1BS_4073CE3DB.E34|||transcript:Traes_1BS_4073CE3DB.1|1|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS)
>     IWGSC_CSS_1BS_scaff_1420918    871    Kronos166    G    A    40.0    Pass    seed_avail=Kronos166;EFF=SYNONYMOUS_CODING(LOW|SILENT|agG/agA|R194||Traes_1BS_4073CE3DB.E34|||transcript:Traes_1BS_4073CE3DB.1|1|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS)
>     IWGSC_CSS_1BS_scaff_1420918    912    Kronos209    G    A    40.0    Pass    seed_avail=Kronos209;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|gGa/gAa|G208E||Traes_1BS_4073CE3DB.E34|||transcript:Traes_1BS_4073CE3DB.1|1|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS)
>     IWGSC_CSS_1BS_scaff_1420918    5050    Kronos2553    G    A    40.0    Pass    seed_avail=Kronos2553;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|Gta/Ata|V692I||Traes_1BS_4073CE3DB.E34|||transcript:Traes_1BS_4073CE3DB.1|11|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS),UPSTREAM(MODIFIER||1659|||Traes_1BS_587F929ED.E1|||transcript:Traes_1BS_587F929ED.1||1)
>     IWGSC_CSS_1BS_scaff_1912213    607    Kronos910    G    A    40.0    Pass    seed_avail=Kronos910;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|aGt/aAt|S59N||Traes_1BS_0039045F7.E21|||transcript:Traes_1BS_0039045F7.2|1|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS)
>     IWGSC_CSS_1BS_scaff_1975885    313    Kronos684    G    A    40.0    Pass    seed_avail=Kronos684;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|Gga/Aga|G25R||Traes_1BS_CC1B13C13.E32|||transcript:Traes_1BS_CC1B13C13.1|1|1|WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS)
>
>     _______________________________________________
>     Dev mailing list    Dev at ensembl.org
>     Posting guidelines and subscribe/unsubscribe info:
>     http://lists.ensembl.org/mailman/listinfo/dev
>     Ensembl Blog: http://www.ensembl.info/
>
