Imec’s elPrep Software Significantly Speeds Up Genome Sequencing Analysis

San Francisco (USA), October 18, 2018 - This week at ITF Health 2018, imec, the world-leading research and innovation hub in nanoelectronics and digital technologies, showcases elPrep 4.0, a software tool that speeds up human DNA sequencing analysis. elPrep accelerates whole genome and exome processing pipelines by up to an order of magnitude, saving a typical lab hundreds of hours of compute time and allowing more and faster DNA tests. elPrep 4.0 is designed as a drop-in replacement for the preparation steps defined by the GATK (Genome Analysis Toolkit) Best Practices pipelines for variant calling, and it delivers identical results.

DNA sequencing involves splitting a human genome into thousands of fragments, which are then fed to sequencing machines that identify the individual bases. This produces huge data files, which are processed through a pipeline of tools to reconstruct the original DNA sequence from the fragments and to flag variants that may point to, for example, genetic disorders (a step known as variant calling). Data sets for a human whole genome are usually on the order of several hundred GB of uncompressed data, with processing runtimes typically on the order of tens of hours per genome.

elPrep is designed to speed up DNA sequencing analysis by up to an order of magnitude. The new version 4.0 executes all preparation steps up to variant calling. It replaces other DNA sequencing analysis software such as GATK4.0, Picard, and SAMtools while producing identical results. What sets elPrep apart is its architecture, which executes a pipeline in a single pass through the data, no matter how long the pipeline is.

elPrep is designed as a multi-threaded application that runs entirely in memory, avoids repeated file I/O, and merges the computations of several DNA sequencing preparation steps. As a result, in a typical run, elPrep is up to ten times faster than other software tools using the same resources. It is designed as a seamless replacement that delivers exactly the same results as GATK4.0, developed by the Broad Institute. elPrep is written in the Go programming language and is available as open source under the GNU Affero General Public License v3 (AGPL-3.0).
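To give a rough idea of what merging several preparation steps into one in-memory, multi-threaded pass can look like, here is a minimal Go sketch. It is not elPrep's actual code or API: the Record type, the Filter type, and the functions compose, runSinglePass, dropUnmapped, and stripChrPrefix are hypothetical names invented for this illustration, and the two example steps are deliberately trivial. Steps that have to compare records with each other, such as sorting, are not covered by this sketch.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Record is a hypothetical, heavily simplified stand-in for one aligned read.
type Record struct {
	Name  string
	Chrom string
	Pos   int
}

// Filter may modify a record and reports whether the record should be kept.
type Filter func(r *Record) bool

// compose merges several per-record steps into a single step, so that the
// data is traversed once regardless of how many steps the pipeline has.
func compose(filters ...Filter) Filter {
	return func(r *Record) bool {
		for _, f := range filters {
			if !f(r) {
				return false
			}
		}
		return true
	}
}

// runSinglePass applies step to every record held in memory. The slice is
// split into disjoint chunks, one goroutine per chunk, and the original
// record order is preserved in the result.
func runSinglePass(records []Record, step Filter, workers int) []Record {
	if workers < 1 {
		workers = 1
	}
	keep := make([]bool, len(records))
	chunk := (len(records) + workers - 1) / workers
	var wg sync.WaitGroup
	for start := 0; start < len(records); start += chunk {
		end := start + chunk
		if end > len(records) {
			end = len(records)
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for i := lo; i < hi; i++ {
				keep[i] = step(&records[i])
			}
		}(start, end)
	}
	wg.Wait()

	out := make([]Record, 0, len(records))
	for i, k := range keep {
		if k {
			out = append(out, records[i])
		}
	}
	return out
}

// dropUnmapped is an illustrative step: discard reads without a reference
// name ("*" is used here as the placeholder for "unmapped").
func dropUnmapped(r *Record) bool { return r.Chrom != "*" }

// stripChrPrefix is an illustrative step: rename "chr1" to "1", and so on.
func stripChrPrefix(r *Record) bool {
	r.Chrom = strings.TrimPrefix(r.Chrom, "chr")
	return true
}

func main() {
	records := []Record{
		{Name: "read1", Chrom: "chr1", Pos: 100},
		{Name: "read2", Chrom: "*", Pos: 0},
		{Name: "read3", Chrom: "chrX", Pos: 42},
	}
	pipeline := compose(dropUnmapped, stripChrPrefix)
	for _, r := range runSinglePass(records, pipeline, 2) {
		fmt.Println(r.Name, r.Chrom, r.Pos)
	}
}
```

The point of the sketch is the design choice described above: adding another step means adding another function to compose, not another read and write of the full data set.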

Imec’s ExaScience Life Lab focuses on software solutions for data-intensive high-performance computing problems, primarily in the life sciences domain. By removing data-intensive computational bottlenecks, it helps companies develop solutions for complex problems that span multiple disciplines. Examples of successful projects include large-scale machine learning for pharmaceutical companies, DNA sequencing software for hospitals and pharmaceutical companies, assay image feature extraction, advanced biostatistics and data analytics, and even multi-physics space weather simulations. The work on elPrep 4.0 was partially funded through the imec.icon research project GAP, which aims to optimize the ICT infrastructure for whole genome sequencing in hospitals, in collaboration with Bluebee, Western Digital, Agilent, Ghent University, KU Leuven, and the academic hospital UZ Leuven.

Hanne Degans
press communications manager - imec

About imec

Imec is the world-leading research and innovation hub in nanoelectronics and digital technologies. The combination of our widely acclaimed leadership in microchip technology and profound software and ICT expertise is what makes us unique. By leveraging our world-class infrastructure and local and global ecosystem of partners across a multitude of industries, we create groundbreaking innovation in application domains such as healthcare, smart cities and mobility, logistics and manufacturing, energy and education.

As a trusted partner for companies, start-ups and universities we bring together more than 4,000 brilliant minds from over 85 nationalities. Imec is headquartered in Leuven, Belgium and has distributed R&D groups at a number of Flemish universities, in the Netherlands, Taiwan, USA, China, and offices in India and Japan. In 2017, imec's revenue (P&L) totaled 546 million euro. Further information on imec can be found at www.imec-int.com.

Imec is a registered trademark for the activities of IMEC International (a legal entity set up under Belgian law as a "stichting van openbaar nut"), imec Belgium (IMEC vzw supported by the Flemish Government), imec the Netherlands (Stichting IMEC Nederland, part of Holst Centre which is supported by the Dutch Government), imec Taiwan (IMEC Taiwan Co.), imec China (IMEC Microelectronics (Shanghai) Co. Ltd.), imec India (Imec India Private Limited), and imec Florida (IMEC USA nanoelectronics design center).
