[ensembl-dev] ./vep offline problem
Sabrina Legoueix-Rodriguez
sabrina.legoueix at inra.fr
Mon Jun 12 15:49:36 BST 2017
Hi Will,
Thanks for your answer.
How important is the biotype for the predictions?
Best regards,
Sabrina
On 22/05/2017 18:02, Will McLaren wrote:
> Hi Sabrina,
>
> There are a few issues with your GTF; if you correct them, it should
> work.
>
> 1) IDs should not be shared by transcripts and genes. In your example,
> I fixed this by prefixing the gene ID with "g_" and the transcript ID
> with "t_"
>
> 2) Transcript entries need a valid biotype; typically this will be
> "protein_coding" (see
> http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#gff)
>
> 3) The phase field must be correctly set for CDS entries.
>
> These points also apply if you use a GFF format file.
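The three points above can be sketched in Python as follows. This is only an illustration under stated assumptions, not VEP's own code: the function names `fix_gtf_line` and `cds_phases` are hypothetical, the attribute name `biotype` is my reading of the documentation linked in point 2 and should be checked against it, and the phase calculation assumes a complete CDS whose first segment begins on a codon boundary.

```python
import re


def fix_gtf_line(line, biotype="protein_coding"):
    """Prefix gene/transcript IDs and add a biotype to transcript rows."""
    if line.startswith("#") or not line.strip():
        return line  # leave comments and blank lines untouched
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return line  # not a 9-column GTF record; leave it alone
    attrs = fields[8]
    # Point 1: genes and transcripts must not share an ID.
    attrs = re.sub(r'\bgene_id "([^"]+)"', r'gene_id "g_\1"', attrs)
    attrs = re.sub(r'\btranscript_id "([^"]+)"', r'transcript_id "t_\1"', attrs)
    # Point 2: transcript rows need a biotype (attribute name is an assumption).
    if fields[2] == "transcript" and "biotype" not in attrs:
        attrs = attrs.rstrip().rstrip(";") + '; biotype "%s";' % biotype
    fields[8] = attrs
    return "\t".join(fields) + "\n"


def cds_phases(lengths):
    """Phases for CDS segments listed 5'->3', assuming the first starts a codon."""
    phases, phase = [], 0
    for length in lengths:
        phases.append(phase)
        # Bases needed at the start of the next segment to finish the open codon.
        phase = (3 - (length - phase) % 3) % 3
    return phases
```

Run `fix_gtf_line` once per line of the tab-separated GTF (running it twice would prefix the IDs twice). For point 3, the single CDS in the example spans 231183-234374, i.e. 3192 bases = 3 x 1064 codons, so under the assumption above its phase column would be 0 rather than ".".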
>
> Hope that helps
>
> Will McLaren
> Ensembl Variation
>
> On 22 May 2017 at 12:36, Sabrina Legoueix-Rodriguez
> <sabrina.legoueix at inra.fr> wrote:
>
> Dear all,
>
> I have installed your recent VEP API locally on my machine, in order
> to annotate SNPs against a home-made genome.
>
> I used the instructions on these pages:
> http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#offline
> http://www.ensembl.org/info/docs/tools/vep/script/index.html
>
> My inputs are:
> -> a home-made reference genome in FASTA format
> -> a VCF file listing the SNPs on that genome
> -> a GTF file with the genome annotations
>
> My goal is to use VEP to generate a .vep file with functional
> annotations of my SNPs.
>
> For instance:
>
> my gtf is:
> tig00000004_pilon_pilon Pacbio gene 231183 234374 . + . gene_id "A";
> tig00000004_pilon_pilon Pacbio transcript 231183 234374 . + . gene_id "A";transcript_id "A";
> tig00000004_pilon_pilon Pacbio CDS 231183 234374 . + . gene_id "A";transcript_id "A";
> tig00000004_pilon_pilon Pacbio exon 231183 234374 . + . gene_id "A";transcript_id "A";
>
> (I also tried with a .gff file.)
>
> my vcf is:
> ##...
> #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT A_ATTACTCG
> tig00000004_pilon_pilon 232205 . G A 9881.15 . AC=8;AF=0.800;AN=10;DP=245;FS=0.000;MLEAC=8;MLEAF=0.800;MQ=60.05;QD=25.82;SOR=0.983 GT:AD:DP:GQ:PL 0:9,0:9:99:0,247
>
> => this snp should be found in the gene "A"
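That expectation can be checked by hand. The sketch below is a hedged illustration, not VEP's algorithm: the function name `cds_coordinates` is hypothetical, and it assumes a single-segment CDS on the + strand that begins on a codon boundary, as in the example GTF.

```python
def cds_coordinates(pos, cds_start, cds_end):
    """Map a 1-based genomic position to (CDS position, codon number, base in codon).

    Returns None if the position lies outside the CDS interval.
    """
    if not cds_start <= pos <= cds_end:
        return None
    cds_pos = pos - cds_start + 1            # 1-based offset into the CDS
    codon = (cds_pos - 1) // 3 + 1           # 1-based codon number
    base_in_codon = (cds_pos - 1) % 3 + 1    # 1, 2 or 3
    return cds_pos, codon, base_in_codon


# SNP at 232205 against the CDS 231183-234374 from the example.
print(cds_coordinates(232205, 231183, 234374))  # (1023, 341, 3)
```

So the SNP falls inside gene "A", at CDS position 1023, the third base of codon 341, and a coding consequence rather than intergenic_variant would be expected once the annotation is parsed.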
>
> To prepare the gtf (or also .gff), I used:
> grep -v "^#" test.gtf | sort -k1,1 -k4,4n -k5,5n | bgzip -c > test.gtf.gz
> tabix -p gff test.gtf.gz
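tabix needs each contig's records to be contiguous and ordered by start position, which is what the sort step is for. A minimal Python sketch to check this before indexing follows; the function name `check_tabix_order` is hypothetical, and it looks only at column 1 (contig) and column 4 (start) of tab-separated lines.

```python
def check_tabix_order(lines):
    """Return None if records are grouped by contig and sorted by start,
    otherwise a short message describing the first problem found."""
    seen = set()
    prev_chrom, prev_start = None, None
    for n, line in enumerate(lines, 1):
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.split("\t")
        chrom, start = fields[0], int(fields[3])
        if chrom != prev_chrom:
            if chrom in seen:
                return "line %d: contig %s is not contiguous" % (n, chrom)
            seen.add(chrom)
        elif start < prev_start:
            return "line %d: start %d precedes previous start %d" % (n, start, prev_start)
        prev_chrom, prev_start = chrom, start
    return None
```

For example, `check_tabix_order(open("test.gtf"))` on the uncompressed, sorted file should return None if the ordering is as tabix expects.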
>
> my command line is:
> ./vep -i test.vcf -gtf test.gtf.gz -fasta ref.fasta --force_overwrite
> or
> ./vep -i test.vcf -gff test.gff.gz -fasta ref.fasta --force_overwrite
>
> The result file is:
> #Uploaded_variation Location Allele Gene Feature Feature_type Consequence cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation Extra
> . tig00000004_pilon_pilon:232205 A - - - *intergenic_variant* - - - - - - IMPACT=MODIFIER
> variant_effect_output.txt (END)
>
>
> It does not work: it retrieves only intergenic variants, which is
> wrong, as I have some SNPs in genes.
>
> When I try the tool on data that I processed with gtf2vep.pl a few
> years ago, it does not work either.
>
> Could you please help me and tell me if I am doing something wrong?
>
> Thank you in advance.
>
> Best regards,
>
> Sabrina
> --
>
> Sabrina
>
>
>
> **Sabrina LEGOUEIX RODRIGUEZ**
> Bioinformatics Platform Manager
>
> Tel.: +33 (0) 5 61 28 57 92
> sabrina.legoueix at toulouse.inra.fr
> www.toulouse-white-biotechnology.com
>
> LinkedIn <https://www.linkedin.com/company/2757525h> Twitter <https://twitter.com/TWB_Biotech>
>
> TWB - Parc technologique du canal • Bâtiment NAPA CENTER B • 3, rue Ariane • 31520 Ramonville Saint-Agne
>
>
> This message and the attachments are strictly personal. They may
> contain confidential information. If you have received this message
> in error, please notify the sender and delete the message and the
> attachments. Any use of this communication received in error is
> prohibited.
>
>
>
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>
>
>
>