[ensembl-dev] VEP offline script: Checking/creating FASTA index fails to find existing index

Cyriac Kandoth kandoth at cbio.mskcc.org
Mon Oct 20 17:18:04 BST 2014


Thanks for troubleshooting this. I will keep at it, and try to find a
better working solution.

This was done on a CentOS 6.5 server, but since I didn't have sudo rights
to install Perl libs, I used perlbrew... which may be a suspect.
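
To see which index files are sitting next to the FASTA and whether the check quoted below would trigger a re-index, here is a minimal Python sketch. It is not VEP or BioPerl code: the names INDEX_SUFFIXES, existing_index_files and would_reindex are made up for illustration, and it assumes that a missing ".fa.index" file counts as index time 0.

```python
import os

# Suffixes mentioned in this thread; which ones appear depends on the DBM backend.
INDEX_SUFFIXES = (".index", ".index.dir", ".index.pag")


def existing_index_files(fasta_path):
    """Return the index files that exist next to fasta_path."""
    return [fasta_path + s for s in INDEX_SUFFIXES if os.path.exists(fasta_path + s)]


def would_reindex(fasta_path, force_reindex=False):
    """Mirror the quoted check: reindex = force_reindex or indextime < modtime.

    Assumption: only "<fasta>.index" is consulted, and a missing file
    counts as index time 0, so the FASTA always looks newer.
    """
    index_path = fasta_path + ".index"
    index_time = os.path.getmtime(index_path) if os.path.exists(index_path) else 0
    mod_time = os.path.getmtime(fasta_path)
    return force_reindex or index_time < mod_time
```

Under those assumptions, a directory holding only ".fa.index.dir" and ".fa.index.pag" makes would_reindex() return True on every run, which matches the behaviour described further down.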

~Cyriac

On Mon, Oct 20, 2014 at 11:30 AM, mag <mr6 at ebi.ac.uk> wrote:

>  Hi Cyriac,
>
> As Will said, this is a Bioperl issue.
>
> The module Bio::DB::Fasta is responsible for the indexing.
> According to the documentation (
> http://search.cpan.org/dist/BioPerl-1.6.901/Bio/DB/Fasta.pm), it uses
> the AnyDBM_File module to decide how to index the file.
>
> The type of index created seems to depend on the environment you're
> running in.
> We have noticed the creation of .pag and .dir indexes in limited Linux
> distributions (for example VMs), which might be missing the required
> executables.
> Bio::DB::Fasta is then unable to identify this as a correct index and keeps
> re-indexing the file although nothing has changed.
>
> One workaround is to manually edit your copy of Bio/DB/Fasta.pm so that
> the reindex check is disabled:
> -  my $reindex = $force_reindex || $indextime < $modtime;
> +  my $reindex = 0; # $force_reindex || $indextime < $modtime;
> It does mean, though, that it will not pick up that your file has changed, so
> you would need to edit this again every time you get a new FASTA file.
>
> If you can find a working solution, I would be interested to hear about it.
>
>
> Regards,
> Magali
>
>
> On 20/10/2014 09:29, Will McLaren wrote:
>
>   Hi Cyriac,
>
>  This is not something I've come across before; the FASTA indexing is
> performed by code that we do not maintain (the Bio::DB::Fasta module is
> part of the BioPerl package).
>
>  Which version of BioPerl are you using? (There are known issues with
> 1.2.3, though not this issue AFAIK; the VEP installs 1.6.0.) And are you
> using a single FASTA file or a directory containing multiple FASTA files?
>
>  For VEP it is normal that it just generates the .fa.index file; I have
> never seen the other two you mention (perhaps they appear with a directory
> of files rather than a single .fa).
>
>  I'd try removing the indexes and reindexing, or removing the .fa file
> and re-downloading/re-generating it.
>
>  HTH
>
>  Will
>
>  On 18 Oct 2014 02:33, "Cyriac Kandoth" <kandoth at cbio.mskcc.org> wrote:
>
>>  Hi Devs,
>>
>>  The code that checks whether a FASTA index needs to be created looks for a
>> file with extension ".fa.index". However (and this may be recent), the
>> indexes created are files named ".fa.index.dir" and ".fa.index.pag". I
>> haven't checked the code to confirm this. I'm assuming this is the case,
>> since VEP appears to index the FASTA every time it runs, unless I create a
>> copy of ".fa.index.pag" with extension ".fa.index".
>>
>>  Cheers!
>>
>>  ~Cyriac
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>
>>