[ensembl-dev] Peak memory/gzip use in VEP
morungos at gmail.com
Mon Mar 7 15:48:31 GMT 2016
I’m seeing hefty peak memory use from VEP, and it’s breaking some of my cluster jobs. I think one of the causes is that it can temporarily spin up many gzip processes; these showed up clearly in top. This was something of a surprise to me, as all the code I could see used IO::Uncompress::* when it was available.
However, I did eventually find that deserialize_from_file in the ensembl-variation API is probably where this is happening. Can I suggest adding an option for in-process, Perl-based deserialization? My guess is that not running this through piped open() calls would actually improve performance here.
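To make the suggestion concrete, here is a rough sketch of the kind of thing I have in mind. It assumes the files being read are gzipped Storable dumps, which is my reading of the code rather than something I have confirmed. The name deserialize_in_process and the toy data are made up for illustration and are not part of the API. As far as I know, Storable wants a real file rather than an IO::Uncompress handle, so the sketch inflates to a temporary file first.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
use Storable qw(nstore retrieve);

# Hypothetical alternative to reading through a piped "gzip -dc FILE |":
# decompress inside the Perl process, then let Storable read the result.
sub deserialize_in_process {
    my ($gz_file) = @_;

    # Inflate to a temporary file, since Storable reads from a real file.
    my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
    close $tmp_fh;
    gunzip($gz_file => $tmp_name)
        or die "gunzip failed for $gz_file: $GunzipError\n";

    my $obj = retrieve($tmp_name);
    unlink $tmp_name;
    return $obj;
}

# Round trip on a toy structure standing in for a cached chunk (assumed shape).
my $data = { chr => '1', variants => [ 101, 202, 303 ] };
my (undef, $plain) = tempfile(UNLINK => 1);
my (undef, $gz)    = tempfile(SUFFIX => '.gz', UNLINK => 1);
nstore($data, $plain);
gzip($plain => $gz) or die "gzip failed: $GzipError\n";

my $copy = deserialize_in_process($gz);
print scalar(@{ $copy->{variants} }), " variants restored\n";
```

This trades the extra gzip process for some temporary disk I/O, so whether it is actually faster would need measuring on real cache files.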
Am I on a sensible track with this?
All the best