[ensembl-dev] ./vep offline problem

Sabrina Legoueix-Rodriguez sabrina.legoueix at inra.fr
Mon Jun 12 15:49:36 BST 2017


Hi Will,

Thanks for your answer.
How important is the biotype for the predictions?

Best regards,

Sabrina

On 22/05/2017 18:02, Will McLaren wrote:
> Hi Sabrina,
>
> There are a few issues with your GTF; if you correct them, it should
> work.
>
> 1) IDs should not be shared by transcripts and genes. In your example, 
> I fixed this by prefixing the gene ID with "g_" and the transcript ID 
> with "t_"
>
> 2) Transcript entries need a valid biotype; typically this will be 
> "protein_coding" (see 
> http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#gff)
>
> 3) The phase field must be correctly set for CDS entries.
>
> These points also apply if you use a GFF format file.
>
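The three points above can be sketched as a short Python script. This is only an illustration under stated assumptions, not the fix Will actually applied: the function name `fix_gtf` is hypothetical, the attribute key `biotype` is assumed from the VEP documentation linked in point 2, and the phase calculation assumes each CDS is complete and starts on a codon boundary.

```python
import re

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_attrs(field):
    """Turn a GTF attribute column into an ordered dict of key -> value."""
    return dict(ATTR_RE.findall(field))


def format_attrs(attrs):
    """Serialise attributes back into GTF style."""
    return " ".join(f'{key} "{value}";' for key, value in attrs.items())


def fix_gtf(lines, biotype="protein_coding"):
    """Return corrected GTF lines (hypothetical helper, tab-delimited input).

    1) gene and transcript IDs get distinct prefixes ("g_" and "t_")
    2) transcript records get a biotype attribute if they lack one
    3) CDS records get a phase computed from the preceding CDS length
    """
    rows = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        attrs = parse_attrs(cols[8])
        if "gene_id" in attrs:
            attrs["gene_id"] = "g_" + attrs["gene_id"]
        if "transcript_id" in attrs:
            attrs["transcript_id"] = "t_" + attrs["transcript_id"]
        if cols[2] == "transcript":
            # Assumption: VEP reads an attribute named "biotype".
            attrs.setdefault("biotype", biotype)
        rows.append((cols, attrs))

    # Group CDS segments per transcript so the phase can be accumulated.
    cds_by_transcript = {}
    for cols, attrs in rows:
        if cols[2] == "CDS":
            cds_by_transcript.setdefault(attrs.get("transcript_id"), []).append(cols)

    for segments in cds_by_transcript.values():
        # Walk segments in 5' to 3' order: ascending on "+", descending on "-".
        reverse = segments[0][6] == "-"
        segments.sort(key=lambda c: int(c[3]), reverse=reverse)
        consumed = 0  # coding bases seen so far in this transcript
        for cols in segments:
            # Phase = bases to skip to reach the next full codon.
            cols[7] = str((3 - consumed % 3) % 3)
            consumed += int(cols[4]) - int(cols[3]) + 1

    return ["\t".join(cols[:8] + [format_attrs(attrs)]) for cols, attrs in rows]
```

Calling `fix_gtf(open("test.gtf"))` returns the corrected lines, which would still need to be sorted, compressed with bgzip and indexed with tabix before being passed to VEP. IDs that already carry a prefix would be prefixed again, so the sketch is meant to be run once on the original file.
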
> Hope that helps
>
> Will McLaren
> Ensembl Variation
>
> On 22 May 2017 at 12:36, Sabrina Legoueix-Rodriguez 
> <sabrina.legoueix at inra.fr> wrote:
>
>     Dear all,
>
>     I have installed your recent VEP API locally on my machine so that
>     I can annotate SNPs against a home-made genome.
>
>     I used the instructions on these pages:
>     http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#offline
>     http://www.ensembl.org/info/docs/tools/vep/script/index.html
>
>     My inputs are:
>     -> a home-made reference genome in a FASTA file
>     -> a .VCF file listing the SNPs on that genome
>     -> a .GTF file with the genome annotations
>
>     My goal is to use VEP to generate a .vep file with functional
>     annotations of my SNPs.
>
>     For instance:
>
>     my gtf is:
>     tig00000004_pilon_pilon    Pacbio    gene    231183    234374    .    +    .    gene_id "A";
>     tig00000004_pilon_pilon    Pacbio    transcript    231183    234374    .    +    .    gene_id "A";transcript_id "A";
>     tig00000004_pilon_pilon    Pacbio    CDS    231183    234374    .    +    .    gene_id "A";transcript_id "A";
>     tig00000004_pilon_pilon    Pacbio    exon    231183    234374    .    +    .    gene_id "A";transcript_id "A";
>
>     (I also tried with a .gff)
>
>     my vcf is:
>     ##...
>     #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  A_ATTACTCG
>     tig00000004_pilon_pilon 232205  .       G       A       9881.15 .       AC=8;AF=0.800;AN=10;DP=245;FS=0.000;MLEAC=8;MLEAF=0.800;MQ=60.05;QD=25.82;SOR=0.983      GT:AD:DP:GQ:PL  0:9,0:9:99:0,247
>     => this snp should be found in the gene "A"
>
>     To prepare the gtf (or also .gff), I used:
>     grep -v "^#" test.gtf | sort -k1,1 -k4,4n -k5,5n | bgzip -c > test.gtf.gz
>     tabix -p gtf test.gtf.gz
>
>     my command line is:
>     ./vep -i test.vcf -gtf test.gtf.gz -fasta ref.fasta --force_overwrite
>     or
>     ./vep -i test.vcf -gff test.gff.gz -fasta ref.fasta --force_overwrite
>
>     The result file is:
>     #Uploaded_variation     Location        Allele  Gene    Feature Feature_type    Consequence     cDNA_position   CDS_position    Protein_position        Amino_acids     Codons  Existing_variation      Extra
>     .       tig00000004_pilon_pilon:232205  A       -       -       -       intergenic_variant      -       -       -       -       -       -       IMPACT=MODIFIER
>     variant_effect_output.txt (END)
>
>
>     It does not work: it retrieves only intergenic variants, which is
>     wrong as I have some SNPs in genes...
>
>     When I try the tool on data that I worked on with gtf2vep.pl a
>     few years ago, it does not work either...
>
>     Could you please help me and tell me if I am doing something wrong?
>
>     Thank you in advance.
>
>     Best regards,
>
>     Sabrina
>     -- 
>     Sabrina LEGOUEIX RODRIGUEZ
>     Head of the Bioinformatics Platform
>
>     Tel.: +33 (0) 5 61 28 57 92
>     sabrina.legoueix at toulouse.inra.fr
>     www.toulouse-white-biotechnology.com
>
>     LinkedIn: https://www.linkedin.com/company/2757525h
>     Twitter: https://twitter.com/TWB_Biotech
>
>     TWB - Parc technologique du canal • Bâtiment NAPA CENTER B • 3,
>     rue Ariane • 31520 Ramonville Saint-Agne
>
>     This message and the attachments are strictly personal. They may
>     contain confidential information. If you have received this
>     message in error, please notify the sender and delete the message
>     and the attachments. Any use of this communication received in
>     error is prohibited.
>
>
>
>     _______________________________________________
>     Dev mailing list Dev at ensembl.org
>     Posting guidelines and subscribe/unsubscribe info:
>     http://lists.ensembl.org/mailman/listinfo/dev
>     Ensembl Blog: http://www.ensembl.info/
>
