[ensembl-dev] Variant Effect Predictor --gene option not working

Will McLaren wm2 at ebi.ac.uk
Wed Jan 9 14:25:55 GMT 2013


Hello again,

1) I have fixed the download for 2.4 - the link should work now.

2) Most of the code that the VEP uses is actually contained in the API; in
version 69 the code that adds the gene column is there (as it is in
previous versions, though apparently not as far back as 66!). Hence in some
cases it does not make a difference which version of the script file you
use; largely the script just parses your command line parameters and does
some initial configuration. The bulk of the operational code resides in
your API checkout.

However, we would always encourage you to use the corresponding
script/API/DB/cache versions to avoid errors.
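
To illustrate the idea of matching versions, here is a minimal sketch of a
check that flags a script/API combination that is not a known pair. It is
not part of the VEP; the function name and table are hypothetical, and the
table holds only the pairings mentioned in this thread (2.7 with 69 as
stated, 2.4 with 66 as an assumption implied by the download page).

```python
# Hypothetical helper, not part of the VEP distribution.
# The pairings are limited to those mentioned in this thread; the download
# page is the authoritative list.
KNOWN_PAIRS = {
    "2.7": 69,  # stated in this thread
    "2.4": 66,  # assumption: implied by the download link for release 66
}


def check_versions(script_version: str, api_version: int) -> str:
    """Return 'ok', 'mismatch' or 'unknown' for a script/API combination."""
    expected = KNOWN_PAIRS.get(script_version)
    if expected is None:
        # The script version is not in our (deliberately incomplete) table.
        return "unknown"
    return "ok" if expected == api_version else "mismatch"


if __name__ == "__main__":
    # The combination from the original report: script 2.7 with API 66.
    print(check_versions("2.7", 66))
```

Run as a script, this reports the combination from the original report
(script 2.7 with API 66) as a mismatch.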

The cache contains sufficient information such that the VEP can work out
(on the fly) the consequences of any variant you give it - broadly speaking
this information consists of the coordinates, structure and sequence of
every transcript in Ensembl. The cache also contains the equivalent
information for regulatory features, as well as the location, alleles and
MAF of every short variant from dbSNP.
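
As a toy illustration of why transcript coordinates are enough to place a
variant, the sketch below classifies a position against a transcript's
exons. It is not the VEP's algorithm or its cache format; the Transcript
class, its fields, the coordinates and the returned labels are all
invented for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Transcript:
    """Toy stand-in for cached transcript structure (hypothetical fields)."""
    stable_id: str
    chrom: str
    exons: List[Tuple[int, int]]  # 1-based inclusive (start, end), sorted

    @property
    def start(self) -> int:
        return self.exons[0][0]

    @property
    def end(self) -> int:
        return self.exons[-1][1]


def locate(tr: Transcript, chrom: str, pos: int) -> str:
    """Classify a position as 'exon', 'intron' or 'outside' the transcript."""
    if chrom != tr.chrom or pos < tr.start or pos > tr.end:
        return "outside"
    for start, end in tr.exons:
        if start <= pos <= end:
            return "exon"
    # Inside the transcript span but in no exon, so between two exons.
    return "intron"


if __name__ == "__main__":
    # Made-up transcript; the coordinates are illustrative only.
    tr = Transcript("TOY_TRANSCRIPT", "21", [(100, 200), (300, 400)])
    for pos in (150, 250, 500):
        print(pos, locate(tr, "21", pos))
```

The real consequence calculation goes much further than this (it also uses
the transcript sequence, regulatory features and known variants), but the
principle of working from locally stored coordinates is the same.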

The contents and limitations of the cache are described in the
documentation:

http://www.ensembl.org/info/docs/variation/vep/vep_script.html#limitations

To create HGVS notations you do need an additional FASTA file containing
the whole genome reference sequence. Once you have downloaded this,
however, only a couple of features still require the database; these are
listed at the link above.

Hope this helps!

Will McLaren
Ensembl Variation




On 9 January 2013 13:14, Sebastian Ginzel <sginze2s at inf.h-brs.de> wrote:

>  Hello Will,
>
> thanks for your quick response.
>
> I would want to stay on version 66, because our NGS pipeline is built
> around this release and I would like to keep all our results as consistent
> as possible.
>
> But now back to my problem: I tried what you said and set up two VEP
> installations, one with Ensembl 66 and one with Ensembl 69 (according to
> the link you provided), both times using the install script to also set up
> the corresponding API and cache version. First of all, I thought I had
> downloaded VEP version 2.4, as stated by the link description on the
> website and the README file inside the tar.gz. But I looked into the source
> code and help screen, and the VEP script is actually version 2.6. This is
> kind of confusing, because if I understood you correctly each VEP version
> should best be used with a certain API version.
>
> I only saw Ensembl gene IDs when I used Ensembl API version 69, but never
> with Ensembl API version 66. The VEP version did not matter, and the --gene
> option wasn't available in either of the two VEP versions.
>
> I also noticed that when using version 66 (with VEP 2.7 and VEP 2.6) I get
> an error message that says
>     Can't use string ("21    26960070    rs116645811    G    A    .
> .    "...) as a SCALAR ref while "strict refs" in use at
> variant_effect_predictor.pl line 1550
>
> When I change this line in the source from
>     $output = $$line;
> to
>     $output = $line;
> it works, but then it doesn't work for API version 69 anymore. Maybe this
> is a bit off topic though.
>
> So I think now I have two questions regarding the gene IDs:
>
> 1) How can I download the actual VEP 2.4 script to try out what you
> suggested? The link on the official website only lets me download VEP 2.6.
> 2) To me it seems that the API version is the cause of my problem,
> because I get gene IDs when I use Ensembl 69 with either of the two VEP
> versions. Is there any explanation for this?
>
> And a third question about the cache: What is included in the cache and
> where can I find information on how to use the local database to add custom
> annotations? I would really like to use the local cache only, but I always
> used our local database because I figured that you couldn't have possibly
> stored all information for all possible variants in a single 2GB cache
> file, or could you?
>
>
> Best wishes & sorry for all the text and all the questions :)
> Sebastian
>
>
> On 08.01.2013 14:17, Will McLaren wrote:
>
> Hello Sebastian,
>
>  Thanks for the detailed report.
>
>  I think the problem may be caused by using an older version of the API.
> The latest version of the script (2.7) should be used with version 69 of
> the Ensembl API.
>
>  If you need to use version 66 of the API (for example if you are unable
> to upgrade the database you are using), you should use the appropriate
> script version with this. You can see which versions go together here:
>
>  http://www.ensembl.org/info/docs/variation/vep/vep_script.html#download
>
>  As an aside, you may find that the latest version of the VEP gives you
> everything you need in the cache files available from us without having to
> use a local database. However, of course if you are using a local database
> for custom annotations, you should continue to do so.
>
>  Regards
>
>  Will McLaren
> Ensembl Variation
>
>
> On 8 January 2013 11:59, Sebastian Ginzel <sginze2s at inf.h-brs.de> wrote:
>
>> Dear Ensembl-Developer Team,
>>
>> I want to use the Variant Effect Predictor standalone script to annotate
>> my VCF file for further processing and I need the variants to have Ensembl
>> Gene IDs.
>>
>> Unfortunately the --gene option is not working and results in this error
>> message:
>>
>> Unknown option: gene
>> ERROR: Failed to parse command-line flags
>>
>> Without the --gene option everything runs through perfectly, but no
>> Ensembl Gene IDs show up in the output, although the documentation available
>> at http://www.ensembl.org/info/docs/variation/vep/vep_script.html#output
>> suggests that the output of ENSG IDs is forced when using the --cache
>> option (which I also use). A quick check of the source code of the
>> variant_effect_predictor.pl script showed me that the --gene option
>> no longer seems to be implemented.
>>
>> I saw that there was somebody mentioning the removal of the --gene option
>> in the mailing list archives following a thread that started on 5th
>> December 2012 09:12:48. But it doesn't mention anything like my problem.
>>
>> That leaves me with two questions:
>>
>> 1) What happened to the --gene option and can anyone reproduce this?
>> 2) How can I force the population of the Ensembl Gene ID column when
>> --cache is also not working?
>>
>>
>> Best wishes,
>> Sebastian Ginzel
>>
>> PS: Here is what I did to set up VEP on my Ubuntu 12.04 system with perl
>> v5.14.2.
>>
>> I downloaded and set up the latest VEP version 2.7 (
>> http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl-tools/scripts/variant_effect_predictor.tar.gz?view=tar&root=ensembl&pathrev=branch-ensembl-69 - MD5Sum ab780dcb0267e5872f85ebe2ff4837f5)
>>
>> "perl variant_effect_predictor.pl --help" shows me that I actually use
>> the 2.7 version.
>>
>> I also downloaded some plugins through Git using:
>> git clone "https://github.com/ensembl-variation/VEP_plugins"
>>
>> I used this command line to call the script:
>>
>> perl /home/sginze2s/vep/lib/vep/bin/variant_effect_predictor/variant_effect_predictor.pl
>> -i /home/sginze2s/vep/lib/vep/sample1.vcf -o /tmp/bla.vcf --cache
>> --dir /home/sginze2s/vep/lib/vep/bin/cache --prefetch
>> --no_adaptor_cache --write_cache --strip --everything --gmaf --xref_refseq
>> --failed 1 --fork 4 --vcf --format vcf --no_progress --check_existing
>> --check_svs
>> --plugin Condel,/home/sginze2s/vep/lib/vep/bin/cache/Plugins/config/Condel/config
>> --plugin Blosum62 --plugin Downstream --species homo_sapiens
>> --db_version=66 --host bio.inf.h-brs.de --user ensembl --password
>> ******* --port 13306 --force_overwrite --quiet --gene
>>
>> I use Ensembl API version 66 and use the PERL5LIB variable to link to it.
>>
>> PERL5LIB=/lib/ensembl_66/ensembl-functgenomics/modules:/lib/ensembl_66/ensembl-variation/modules:/lib/ensembl_66/ensembl-compara/modules:/lib/ensembl_66/ensembl/modules:/lib/bioperl-1.2.3:/lib/bioperl-1.5.2_102_Matrix
>>
>>
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>