elPrep: smarter, faster DNA sequence analysis software

Streamline your genomic research with an integrated tool that excels in speed and accuracy.

If genome sequencing is an important part of your medical practice or research, time is all too often not on your side. After identification of the individual bases through sequencing hardware, hundreds of gigabytes of data need to be processed to reconstruct the DNA sequence and flag variants that might indicate genetic disorders.

It’s a procedure that typically involves a series of DNA sequence analysis software tools and takes a lot of time, hampering your research and delaying your results. elPrep speeds up that process, and version 5 now includes support for variant calling.

Faster DNA sequence analysis

elPrep is a DNA sequence analysis software solution that’s up to sixteen times faster than other programs on similar computing hardware, all without using expensive GPU or FPGA acceleration. The reason for this remarkable increase in speed? A smart software architecture that:

  • combines the processing of multiple genome sequencing preparation steps and parallelizes their execution
  • optimizes memory management
  • minimizes the number of I/O accesses
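
The three points above can be illustrated with a minimal Go sketch. It is not elPrep's actual code: the `Record` type, the two steps, and the `RunFused` function are hypothetical names chosen for illustration. The sketch assumes the records already sit in memory, applies every preparation step to each record in one pass, and splits the work across goroutines, so no intermediate file is written between steps.

```go
package main

import (
	"fmt"
	"sync"
)

// Record is a hypothetical stand-in for one aligned read.
// Real SAM/BAM records carry many more fields.
type Record struct {
	Name      string
	Mapped    bool
	ReadGroup string
}

// Step is one preparation step applied to a record in memory.
// Returning false drops the record from the output.
type Step func(r *Record) bool

// DropUnmapped filters out reads that did not align.
func DropUnmapped(r *Record) bool { return r.Mapped }

// SetReadGroup returns a step that tags every record with a read group.
func SetReadGroup(group string) Step {
	return func(r *Record) bool {
		r.ReadGroup = group
		return true
	}
}

// applyAll runs the steps in order and stops at the first one that drops the record.
func applyAll(r *Record, steps []Step) bool {
	for _, step := range steps {
		if !step(r) {
			return false
		}
	}
	return true
}

// RunFused applies all steps to every record in a single pass.
// The records are split into contiguous chunks, one per worker goroutine.
// Each worker touches only its own index range, so input order is preserved.
func RunFused(records []Record, steps []Step, workers int) []Record {
	if workers < 1 {
		workers = 1
	}
	keep := make([]bool, len(records))
	chunk := (len(records) + workers - 1) / workers
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo := w * chunk
		hi := lo + chunk
		if hi > len(records) {
			hi = len(records)
		}
		if lo >= hi {
			break
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for i := lo; i < hi; i++ {
				keep[i] = applyAll(&records[i], steps)
			}
		}(lo, hi)
	}
	wg.Wait()

	// Collect the surviving records once, after all steps have run.
	out := make([]Record, 0, len(records))
	for i, ok := range keep {
		if ok {
			out = append(out, records[i])
		}
	}
	return out
}

func main() {
	reads := []Record{
		{Name: "read1", Mapped: true},
		{Name: "read2", Mapped: false},
		{Name: "read3", Mapped: true},
	}
	steps := []Step{DropUnmapped, SetReadGroup("sample1")}
	for _, r := range RunFused(reads, steps, 2) {
		fmt.Println(r.Name, r.ReadGroup)
	}
}
```

A tool-per-step pipeline would instead read the input, write an intermediate file, and read it again for every step. Fusing the steps trades that repeated I/O for keeping the data in memory.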

Figure 1 WGS Benchmark*. Runtime, peak RAM, and disk use in GATK 4 (colored) vs. elPrep 5 (grey). The runtime and resource use for GATK 4 are shown per step in the pipeline, whereas all steps are combined into a single data point for elPrep 5. The GATK 4 runs were executed for both versions of the haplotype caller algorithm it implements. Compared to GATK 4, elPrep 5 executes the pipeline 8.5-16x faster using roughly 0.70x of the RAM and roughly 0.70x of the disk space. The outputs of elPrep are identical to the GATK outputs. (*50x NA12878 Illumina Platinum genome, hg38, run on AWS m5.24xlarge, Intel Xeon, 96 vCPU, 384 GiB RAM)

Comprehensive DNA sequence analysis software

elPrep is developed by ExaScience Life Lab, a division of imec that focuses on software solutions for data-intensive and high-performance computing problems, primarily in life sciences. Thanks to this expertise, elPrep is a tool that produces identical results to established genome analysis programs such as SAMtools, Picard and GATK4.

Moreover, elPrep seamlessly replaces all these other tools, including variant calling, giving you a single, ultra-fast solution for a large part of the DNA sequence analysis process.

Figure 2 AWS WGS Scaling Benchmark**. The graph shows the runtime (left) and the dollar cost (right) for running the variant calling pipeline on a variety of AWS server instances. The fastest elPrep run is more than 8x faster than GATK for roughly the same price. Concretely, elPrep processes the WGS sample in under 6 hours for about 32 dollars. (**M5.2xlarge: 8 vCPU, 32 GiB, 0.46$/hour. M5.16xlarge: 64 vCPU, 256 GiB, 3.68$/hour. M5.24xlarge: 96 vCPU, 384 GiB, 5.52$/hour (September 2020 prices for EU Frankfurt))
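
The relation between runtime and cost in the caption is simply hours multiplied by the hourly instance price. The small Go sketch below makes that arithmetic explicit. The `Instance` type and `RunCost` function are hypothetical names. The M5.2xlarge price is read as 0.46 dollars per hour, an assumption that matches the other two prices at the same rate per vCPU. The caption does not say which instance the run of under 6 hours used, so the 6-hour figure is applied here only as an upper bound.

```go
package main

import "fmt"

// Instance holds the hourly price quoted in the Figure 2 caption
// (September 2020, EU Frankfurt).
type Instance struct {
	Name       string
	VCPU       int
	USDPerHour float64
}

// RunCost returns the cost in dollars of occupying the instance for the given number of hours.
func RunCost(inst Instance, hours float64) float64 {
	return inst.USDPerHour * hours
}

func main() {
	instances := []Instance{
		{Name: "m5.2xlarge", VCPU: 8, USDPerHour: 0.46}, // assumed reading of the caption's price
		{Name: "m5.16xlarge", VCPU: 64, USDPerHour: 3.68},
		{Name: "m5.24xlarge", VCPU: 96, USDPerHour: 5.52},
	}
	const hours = 6.0 // upper bound taken from the caption
	for _, inst := range instances {
		fmt.Printf("%-12s %2d vCPU: %.0f h costs %.2f USD\n",
			inst.Name, inst.VCPU, hours, RunCost(inst, hours))
	}
}
```

Six hours on the 96 vCPU instance comes to about 33 dollars, which is consistent with the quoted cost of about 32 dollars for a run that finishes in just under 6 hours.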

Want to use elPrep?

The elPrep DNA sequence analysis software is validated by Janssen Pharmaceutica and Seven Bridges Genomics, and used in production at various Belgian hospitals and at companies such as BlueBee. It’s written in Go, an open-source programming language, and can run on any standard server, on-premises or in the cloud.

The elPrep source code is freely available on GitHub. Need customization or support? Don’t hesitate to contact us.

Download the elPrep binaries

