[ensembl-dev] VEP error: Forked process failed.

Will McLaren wm2 at ebi.ac.uk
Tue May 14 09:42:17 BST 2013


Hello,

Your aa_grantham_distance plugin is somewhat inefficient - it retrieves the
peptide alleles from the HGVS annotation, which itself requires some
database fetching and processing to produce. This is why it is slow.

You can get the peptides from the transcript variation object:

my @peps = split "/", $tva->transcript_variation->pep_allele_string();

This will give you single-letter AA codes, but you could either modify your
hash or use BioPerl to convert:

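use Bio::PrimarySeq; use Bio::SeqUtils;
# set the alphabet explicitly: a single residue may otherwise be guessed as DNA
my $seqobj = Bio::PrimarySeq->new(-seq => $single_letter_aa, -alphabet => 'protein');
my $three_letter_aa = Bio::SeqUtils->seq3($seqobj);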
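
If you would rather not pull in BioPerl, a plain lookup table does the same
job. This is only a rough sketch: the helper name peps_to_three_letter is
made up for illustration, and it assumes the input looks like the "R/S"
strings that pep_allele_string() returns.

```perl
use strict;
use warnings;

# Standard one-letter to three-letter amino acid codes.
my %three_letter = (
    A => 'Ala', R => 'Arg', N => 'Asn', D => 'Asp', C => 'Cys',
    Q => 'Gln', E => 'Glu', G => 'Gly', H => 'His', I => 'Ile',
    L => 'Leu', K => 'Lys', M => 'Met', F => 'Phe', P => 'Pro',
    S => 'Ser', T => 'Thr', W => 'Trp', Y => 'Tyr', V => 'Val',
);

# Hypothetical helper: split a peptide allele string such as "R/S" and
# convert each part. Anything not in the table (e.g. "*" or "-") is
# passed through unchanged.
sub peps_to_three_letter {
    my ($pep_allele_string) = @_;
    return map { $three_letter{$_} // $_ } split m{/}, $pep_allele_string;
}

my @codes = peps_to_three_letter('R/S');   # ('Arg', 'Ser')
```

The other option mentioned above, rekeying the distances hash by
single-letter codes, avoids the conversion altogether.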

You should also declare your distances hash in the new() sub and store it
on $self; this will also marginally speed up your plugin.
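
A minimal sketch of that pattern follows. It is not your plugin: the package
names GranthamSketch and StubTVA are invented so the example runs without an
Ensembl installation, and only two distances are filled in (taken from the
published Grantham table as I remember it, so please check them against your
own matrix). A real plugin would inherit from
Bio::EnsEMBL::Variation::Utils::BaseVepPlugin and call SUPER::new.

```perl
use strict;
use warnings;

package GranthamSketch;   # hypothetical name, for illustration only

sub new {
    my ($class) = @_;
    # In a real VEP plugin this would be: my $self = $class->SUPER::new(@_);
    my $self = bless {}, $class;

    # Built once per plugin instance instead of on every run() call.
    # Keys are alphabetically sorted single-letter pairs.
    $self->{distances} = {
        'I/L' => 5,
        'C/W' => 215,
    };
    return $self;
}

# Same shape as a plugin run() method: takes a
# TranscriptVariationAllele-like object and returns a hashref.
sub run {
    my ($self, $tva) = @_;
    my $string = $tva->transcript_variation->pep_allele_string();
    return {} unless defined $string;

    my @peps = split m{/}, $string;
    return {} unless @peps == 2;

    # Sort so that "L/I" and "I/L" hit the same entry.
    my $key      = join '/', sort @peps;
    my $distance = $self->{distances}{$key};
    return defined $distance ? { Grantham => $distance } : {};
}

package StubTVA;   # stand-in for the real VEP objects, demo only

sub new { my ($class, $string) = @_; return bless { s => $string }, $class }
sub transcript_variation { return $_[0] }
sub pep_allele_string    { return $_[0]{s} }

package main;

my $plugin = GranthamSketch->new;
my $result = $plugin->run(StubTVA->new('L/I'));   # { Grantham => 5 }
```

Sorting the pair before the lookup means the hash only needs one entry per
amino acid pair rather than two.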

Regarding the forking issues, we are working on improving stability under
forking.

Thanks for your patience

Will


On 14 May 2013 07:37, Guillermo Marco Puche <
guillermo.marco at sistemasgenomicos.com> wrote:

>  Hello,
>
> I'm not really sure which one of those plugins is causing the fork error.
> I cannot recreate it now running each one of them separately.
>
> Here are both:
>
> https://github.com/guillermomarco/vep_plugins_71
>
> They also slow down the "Calculating consequences" step a lot.
> aa_grantham_distance.pm is just a hardcoded script from one of the
> biologists at my work, copied, pasted and adapted to run as a VEP
> plugin. Maybe the problem is that the matrix is defined again every
> time the subroutine is called. I'm running out of neither memory nor
> CPU. I'm currently using it with 2 threads and a buffer size of 500 for a
> 5000-variant VCF file.
>
> In my honest opinion, one (or even both) of those plugins slows the
> calculation so much that sometimes the fork just dies, like a timeout
> under heavy network traffic. So when you use them together with a lot of
> other plugins like Condel, Consequence, etc., they may be causing the
> process to hang and die.
>
> Best regards,
> Guillermo.
>
>
> On 05/13/2013 03:55 PM, Duarte Molha wrote:
>
> I also get this error... it is so prevalent and so difficult to pinpoint
> what is causing it that I have given up on forking my annotation process.
>
>  I do think it is related to the number of forks. It seems to crash less
> often if you use a low number of forks... anything above 5
> will undoubtedly crash the script at least in my experience.
>
>  Cheers
>
> Duarte
>
> =========================
>      Duarte Miguel Paulo Molha
>           http://about.me/duarte
> =========================
>
>
> On Mon, May 13, 2013 at 2:50 PM, Will McLaren <wm2 at ebi.ac.uk> wrote:
>
>> Hi Guillermo,
>>
>>  Test each plugin individually until you find the one that causes the
>> error. It is highly unlikely that a particular combination of plugins is
>> causing the crash.
>>
>>  Check that there are no "print" (to STDOUT or STDERR) statements in
>> your plugin - forking assumes that code remains silent otherwise it will
>> throw errors like this.
>>
>>  Also, check what, if anything, is cached between runs of your plugin.
>> If you are caching things (for example to avoid re-querying a database),
>> you may need to write storable hooks to ensure the data is getting cached
>> between forks - see
>> https://github.com/ensembl-variation/VEP_plugins/blob/master/ProteinSeqs.pm
>> for an example.
>>
>>  If you still have no luck, send me the code and an input file that
>> recreates the problem.
>>
>>  Regards
>>
>>  Will
>>
>>
>>  On 13 May 2013 13:18, Guillermo Marco Puche <
>> guillermo.marco at sistemasgenomicos.com> wrote:
>>
>>>   Hello,
>>>
>>> I've recently started having problems with the VEP script while using
>>> different plugins (most of them my own).
>>>
>>> 2013-05-13 13:59:44 - Connected to core version 71 database and variation version 71 database
>>> 2013-05-13 13:59:44 - Loaded plugin: vcf_input
>>> 2013-05-13 13:59:44 - Loaded plugin: biobase
>>> 2013-05-13 13:59:44 - Loaded plugin: aa_grantham_distance
>>> 2013-05-13 13:59:44 - Loaded plugin: flanking_sequence
>>> 2013-05-13 13:59:44 - Loaded plugin: Condel
>>> 2013-05-13 13:59:44 - Output fields redefined (37 defined)
>>> 2013-05-13 13:59:44 - Starting...
>>> 2013-05-13 13:59:45 - Read 3888 variants into buffer
>>> 2013-05-13 13:59:54 - Reading transcript data from cache and/or database
>>> [===============================================]  [ 100% ]
>>> 2013-05-13 14:02:38 - Retrieved 6463 transcripts (0 mem, 0 cached, 13743 DB, 7280 duplicates)
>>> 2013-05-13 14:02:38 - Calculating consequences
>>> [===================================>           ]   [ 78% ]
>>> ERROR: Forked process failed
>>>
>>>
>>>
>>> I'm not getting any other error message, so I cannot debug properly. I
>>> thought my plugins were OK but it seems they are not. I think the problem
>>> occurs when I use the "aa_grantham_distance" plugin together with
>>> "flanking_sequence". I've no idea what could be causing this.
>>>
>>> I'm running VEP in verbose mode but I can't get any useful information.
>>> How could I debug that?
>>>
>>> Best regards,
>>> Guillermo.
>>>
>>>
>>>  _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info:
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>>>
>>>