[ensembl-dev] Performance Analysis and Hardware Acceleration of VEP.

Muhammad Ali Akhtar muhammadali201 at gmail.com
Tue Aug 30 19:30:49 BST 2022


Dear Syed Hossain,

Thanks for the response. The article you referred to was published in 2016.
There have been many VEP releases since then, and I think a lot has changed.
Are you saying that the runtime / performance numbers cited in that 2016
article are still valid?

Secondly, you mentioned one underlying computation step (linkage
disequilibrium stats). Can you point me to any publication / study that
lists the algorithms (and mathematical operations) used in VEP or other
variant analysis tools? If there are no such studies, can you at least point
me to the Perl files that implement the linkage disequilibrium stats step?

Third: I read about the fork / multithreaded option. However, this option
is still meant for CPU execution, i.e. multi-core CPU execution. We can
create many (100-200) simple cores (PEs) in a mid-sized FPGA chip. However,
these simple cores will not be as complex as Xeon processor cores; e.g.
FPGA cores are not good at floating-point operations.

I was just wondering if there are any mathematical operations in VEP
algorithms that can only be run on CPUs. In other words, are there steps in
VEP processing that are not suitable for GPUs / FPGA chips?
FPGA chips are better at simple integer arithmetic and logical operations,
and if the mathematical operations in VEP are also simple integer arithmetic
and logical operations, 100 cores in an FPGA can theoretically give a 100x
speedup over one Xeon core.
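
As a rough check on that theoretical figure, here is a minimal Python sketch
of Amdahl's law. The parallel fractions below are purely illustrative
assumptions, not measurements of VEP, and the function name is my own. The
bound also assumes each FPGA core is as fast as one Xeon core, which is
optimistic given the simpler cores described above.

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Upper bound on speedup when only part of the runtime scales with cores.

    parallel_fraction: share of single-core runtime that parallelises (0..1).
    n_cores: number of equally fast cores (an idealising assumption).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unchanged; the parallel part is divided across cores.
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only to show the trend.
    for p in (1.0, 0.99, 0.95, 0.50):
        bound = amdahl_speedup(p, 100)
        print(f"parallel fraction {p:.2f}: at most {bound:.1f}x on 100 cores")
```

So the 100x figure only holds if essentially all of the runtime parallelises;
with even 1% of the work left serial (file I/O, for instance), the bound on
100 cores drops to roughly 50x.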

Muhammad Ali Akhtar

On Tue, Aug 30, 2022 at 1:52 PM Syed Hossain <snhossain at ebi.ac.uk> wrote:

> Hello Muhammad Ali Akhtar,
>
> Thanks for the query!
>
> 1. So we do have a performance analysis of VEP, which was published in
> the latest paper. You can check the results here -
> https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0974-4
> .
>
> 2. VEP has an option to run in multiple threads (the --fork option, see
> https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_fork),
> which is suitable for running on multi-core clusters. Also, some of the
> underlying computation (such as linkage disequilibrium stats) is in C,
> and Perl works as a wrapper. We do have a longer-term plan to switch to
> a modern language.
>
> Hope that answers your query.
>
> Best regards,
> Nakib
>
> On 2022-08-27 14:20, Muhammad Ali Akhtar wrote:
> > Hello Everyone,
> >
> > This is my first time interacting with the VEP dev team. Just had a
> > few questions.
> >
> > 1. Are there any studies / reports out there with runtime /
> > performance analysis of VEP?
> > 2. Have there been any efforts to improve VEP performance, such as
> > coding in different languages (C/C++) or using different processors
> > (GPUs / FPGAs)?
> >
> > I ask this because many steps (like sequence alignment and variant
> > calling) in the genome processing pipeline are being targeted for
> > hardware acceleration, but I couldn't find anything on customising /
> > improving VEP run time. Is it because the runtime of VEP has been
> > short / average even for the most demanding workloads?
> >
> > Regards,
> > Muhammad Ali Akhtar
> > _______________________________________________
> > Dev mailing list    Dev at ensembl.org
> > Posting guidelines and subscribe/unsubscribe info:
> > https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
> > Ensembl Blog: http://www.ensembl.info/
>