[ensembl-dev] VEP installation problems - unable to install GRCh37 caches
Paul Hatton
P.S.HATTON at bham.ac.uk
Tue Apr 26 17:09:32 BST 2016
Cyriac,
I have to send abject apologies - I had omitted the
cat $VEP_DATA/*_vep_84_GRC{h37,h38,m38}.tar.gz | tar -izxf - -C $VEP_DATA
command, and all is fine now. I had been looking at the instructions for so long that my eyes glazed over. It worked fine with the combined
convert_cache.pl --species homo_sapiens,mus_musculus --version 84_GRCh37,84_GRCh38,84_GRCm38 --dir $VEP_DATA
command, and the example program on the gist ran absolutely fine.
Many thanks for pointing me to the gist and apologies again for my error at the final hurdle.
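For anyone who hits the same problem, the unpacking step can also be done one tarball at a time, which makes a failed or skipped archive easier to spot than with the single "cat ... | tar -izxf -" pipeline. This is only a sketch: the function name extract_caches is made up for illustration, and it assumes the tarballs sit directly in the data directory with the *_vep_*.tar.gz names shown in this thread.

```shell
# Unpack every VEP cache tarball found directly inside a data directory.
# Hypothetical helper; assumes file names like homo_sapiens_vep_84_GRCh37.tar.gz.
extract_caches() {
    local data_dir="$1" tarball
    for tarball in "$data_dir"/*_vep_*.tar.gz; do
        # If the glob matched nothing, the pattern is left as a literal string.
        [ -f "$tarball" ] || { echo "no cache tarballs in $data_dir" >&2; return 1; }
        echo "unpacking $(basename "$tarball")"
        # Stop at the first archive that fails to extract.
        tar -zxf "$tarball" -C "$data_dir" || return 1
    done
}
```

It would be called as: extract_caches "$VEP_DATA"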
Regards
--
Paul Hatton
High Performance Computing and Visualisation Specialist
IT Services, The University of Birmingham
Ph: 0121-414-3994 Mob: 07785-977340 Skype: P.S.Hatton
[Service Manager, Birmingham Environment for Academic Research]
[ http://www.birmingham.ac.uk/bear ]
[Also Technical Director, IBM Visual and Spatial Technology Centre]
From: Cyriac Kandoth [mailto:kandoth at cbio.mskcc.org]
Sent: 25 April 2016 23:22
To: Paul Hatton
Cc: Ensembl developers list
Subject: Re: [ensembl-dev] VEP installation problems - unable to install GRCh37 caches
I'm unable to reproduce that error. But looking at the code, it appears to happen if you're missing the "info.txt" file in a cache folder. Check for them like this:
$ ll -h $VEP_DATA/{homo_sapiens,mus_musculus}/*/info.txt
-rw-r--r-- 1 kandoth pwgmgr 786 Feb 26 16:03 /opt/common/CentOS_6-dev/vep/v84/homo_sapiens/84_GRCh37/info.txt
-rw-r--r-- 1 kandoth pwgmgr 1.4K Feb 29 12:31 /opt/common/CentOS_6-dev/vep/v84/homo_sapiens/84_GRCh38/info.txt
-rw-r--r-- 1 kandoth pwgmgr 533 Feb 23 10:14 /opt/common/CentOS_6-dev/vep/v84/mus_musculus/84_GRCm38/info.txt
If your info.txt files are missing, then one of the steps before "convert_cache.pl" was skipped. E.g. make sure you didn't forget to untar those cache tarballs after rsync-ing them.
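The same check can be scripted so that it reports only the cache folders where info.txt is absent. This is a rough sketch, not part of VEP: the function name missing_info_txt is hypothetical, and it assumes the layout shown in the listing above, i.e. <data dir>/<species>/<version_assembly>/info.txt.

```shell
# Report cache version folders that lack info.txt.
# Usage: missing_info_txt DATA_DIR SPECIES...
# Hypothetical helper; assumes DATA_DIR/SPECIES/VERSION_ASSEMBLY/info.txt.
missing_info_txt() {
    local data_dir="$1"; shift
    local species dir found status=0
    for species in "$@"; do
        found=0
        for dir in "$data_dir/$species"/*/; do
            [ -d "$dir" ] || continue   # glob matched nothing
            found=1
            if [ ! -f "${dir}info.txt" ]; then
                echo "missing: ${dir}info.txt"
                status=1
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "no cache directories under $data_dir/$species"
            status=1
        fi
    done
    return "$status"
}
```

For the setup in this thread it would be called as: missing_info_txt "$VEP_DATA" homo_sapiens mus_musculus. A non-zero exit status means at least one folder still needs its tarball unpacked.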
~Cyriac
On Mon, Apr 25, 2016 at 1:01 PM, Paul Hatton <P.S.HATTON at bham.ac.uk> wrote:
Afraid not:
[vep 17:57] $ perl ./convert_cache.pl --species mus_musculus --version 84_GRCm38 --dir $VEP_DATA
2016-04-25 17:58:10 - Processing mus_musculus
2016-04-25 17:58:10 - Processing version 84_GRCm38
Can't use an undefined value as an ARRAY reference at ./convert_cache.pl line 188.
Not sure if this helps:
[vep 18:00] $ ll -h $VEP_DATA
total 13G
-rw-r--r-- 1 appmaint appmaint 1.7G Apr 24 17:32 ExAC.r0.3.sites.minus_somatic.vcf.gz
-rw-r--r-- 1 appmaint appmaint 800K Apr 24 17:35 ExAC.r0.3.sites.minus_somatic.vcf.gz.tbi
drwxr-xr-x 4 appmaint appmaint 512 Apr 24 17:03 homo_sapiens
-r-x------ 1 appmaint appmaint 4.8G Apr 24 16:42 homo_sapiens_vep_84_GRCh37.tar.gz
-r-x------ 1 appmaint appmaint 4.8G Apr 24 16:42 homo_sapiens_vep_84_GRCh38.tar.gz
drwxr-xr-x 3 appmaint appmaint 512 Apr 24 17:17 mus_musculus
-r-x------ 1 appmaint appmaint 1.4G Apr 24 16:42 mus_musculus_vep_84_GRCm38.tar.gz
drwxr-xr-x 2 appmaint appmaint 512 Apr 24 16:59 Plugins
Regards
--
Paul Hatton
From: Cyriac Kandoth [mailto:kandoth at cbio.mskcc.org]
Sent: 25 April 2016 16:34
To: Paul Hatton
Cc: Ensembl developers list
Subject: Re: [ensembl-dev] VEP installation problems - unable to install GRCh37 caches
Try separating out the mouse caches from the human caches...
perl convert_cache.pl --species mus_musculus --version 84_GRCm38 --dir $VEP_DATA
perl convert_cache.pl --species homo_sapiens --version 84_GRCh37,84_GRCh38 --dir $VEP_DATA
If that works, lemme know, I'll update the gist.
~Cyriac
On Sun, Apr 24, 2016 at 12:40 PM, Paul Hatton <P.S.HATTON at bham.ac.uk> wrote:
When I follow the gist it is fine apart from:
[variant_effect_predictor 17:28] $ convert_cache.pl --species homo_sapiens,mus_musculus --version 84_GRCh37,84_GRCh38,84_GRCm38 --dir $VEP_DATA
2016-04-24 17:29:18 - Processing homo_sapiens
2016-04-24 17:29:18 - Processing version 84_GRCh38
Can't use an undefined value as an ARRAY reference at ./convert_cache.pl line 188.
Does this look at all familiar? Maybe an error on my part but I have followed the gist closely.
This is using perl 5.20, which is the version that has been recommended to me for vcf2maf, in case that is relevant:
[variant_effect_predictor 17:35] $ which perl
/gpfs/apps/perl/v5.20.0_gcc-v4.7.2/bin/perl
Many thanks (again)
--
Paul Hatton
From: Cyriac Kandoth [mailto:kandoth at cbio.mskcc.org]
Sent: 19 April 2016 21:59
To: Ensembl developers list
Cc: Paul Hatton
Subject: Re: [ensembl-dev] VEP installation problems - unable to install GRCh37 caches
I forgot to mention - the order of instructions in that gist specifically addresses the error you reported - "For technical reasons this installer is unable to install GRCh37 caches alongside others; please install them separately".
Also, I'd recommend against installing all plugins, as it can get messy. Only the ExAC plugin is currently used by vcf2maf. In the future, it may use more. Here is a list of all the available plugins:
https://github.com/Ensembl/VEP_plugins
~C
On Tue, Apr 19, 2016 at 4:18 PM, Cyriac Kandoth <kandoth at cbio.mskcc.org> wrote:
Hi Paul,
It is appropriate to post such a question to dev at ensembl. If you have an issue specifically with vcf2maf, we (mskcc) can help you at https://github.com/mskcc/vcf2maf/issues
The latest readme for vcf2maf includes instructions for installing VEP v83 - https://github.com/mskcc/vcf2maf - I will remove these instructions in the next few days in favor of gists. Here's a gist for installing VEP v84 with an offline cache for GRCh37 - https://gist.github.com/ckandoth/57d189f018b448774704d3b2191720a6
~Cyriac
On Tue, Apr 19, 2016 at 4:07 AM, Paul Hatton <P.S.HATTON at bham.ac.uk> wrote:
Apologies if this is the wrong list to post this to, but a search for this problem led me to this list and I can't find any mention of it in the archives (which sort-of suggests that I'm on the wrong list).
I look after the applications base on our Linux-based HPC service at the University of Birmingham (UK) and we have recently established a new Centre for Computational Biology and hence I am asked to install much specialist software such as VEP. I have a great deal of experience in installing applications on a Linux HPC service but limited experience of these specialist applications, so apologies again if this is a naive question posted to the wrong list .... feel free to point me elsewhere if more appropriate ......
Anyhow, I have been asked to get vcf2maf running, which depends on VEP. I have been unable to get a clean installation of VEP 82, 83 or 84, which I think is causing knock-on problems for users running vcf2maf, so I'd like to get VEP installed cleanly first. Whilst VEP itself builds fine with
cd ensembl-tools-release-84/scripts/variant_effect_predictor
export PERL5LIB=/gpfs/apps/VEP/84:$PERL5LIB
export PATH=/gpfs/apps/VEP/84/htslib:$PATH
perl INSTALL.pl --DESTDIR /gpfs/apps/VEP/84 --CACHEDIR /gpfs/apps/VEP/84/cache --PLUGINS all
and installs VEP as expected, when I ask it to download the cache files it fails at the end with
- downloading ftp://ftp.ensembl.org/pub/release-84/variation/VEP/xiphophorus_maculatus_vep_84_Xipmac4.4.2.tar.gz
- unpacking xiphophorus_maculatus_vep_84_Xipmac4.4.2.tar.gz
ERROR: For technical reasons this installer is unable to install GRCh37 caches alongside others; please install them separately
and I can't find any help as to what I should do next. However, /gpfs/apps/VEP/84/cache/homo_sapiens has directories 84_GRCh37 and 84_GRCh38, which both seem fully populated, so can this error be safely ignored?
If I then repeat the installation asking not to install any cache files and ask it to install all of the FASTA files the third one fails with
ERROR: Could not change directory to dna
and the installer then terminates, rather than trying the next one. I think this comes from lines 737 to 739 in INSTALL.pl:
foreach my $sub(split /\//, $3) {
$ftp->cwd($sub) or die "ERROR: Could not change directory to $sub\n$@\n";
}
and suggests that there are some missing directories on the download site for these files. Is this the case and, if so, is there any way around this apart from rerunning the installer for each of the 70 options one-by-one, which would take quite a while?
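One way to avoid babysitting a one-by-one rerun is a small wrapper that runs a command once per species, carries on when one of them fails, and lists the failures at the end. The sketch below is generic shell, not something shipped with VEP; the name run_each and the "--" separator convention are made up for illustration.

```shell
# Run a command once per item, continuing past failures.
# Usage: run_each CMD [ARGS...] -- ITEM...
# Each ITEM is appended as the last argument of CMD. Hypothetical helper.
run_each() {
    local -a cmd=() failed=()
    local item
    # Everything before "--" is the command and its fixed arguments.
    while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
        cmd+=("$1")
        shift
    done
    # Drop the "--" separator if it was present.
    if [ "$#" -gt 0 ]; then shift; fi
    for item in "$@"; do
        if ! "${cmd[@]}" "$item"; then
            failed+=("$item")
        fi
    done
    # Summarise what did not succeed so only those need a second look.
    if [ "${#failed[@]}" -gt 0 ]; then
        echo "failed: ${failed[*]}"
        return 1
    fi
}
```

With the installer it might be used as: run_each perl INSTALL.pl --AUTO f --SPECIES -- homo_sapiens mus_musculus. That assumes your copy of INSTALL.pl accepts --AUTO and --SPECIES in that form, so check its --help output first.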
Apologies again if this is the wrong list and/or these are simplistic questions. Any help is much appreciated.
Regards
--
Paul Hatton
_______________________________________________
Dev mailing list Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/