Streamline your genomic research with an integrated tool that excels in speed and accuracy.
If genome sequencing is an important part of your medical practice or research, time is all too often not on your side. After identification of the individual bases through sequencing hardware, hundreds of gigabytes of data need to be processed to reconstruct the DNA sequence and flag variants that might indicate genetic disorders.
It’s a procedure that typically involves a series of DNA sequence analysis software tools and takes a lot of time, hampering your research and delaying your results. That is, unless you speed up the process with elPrep, which supports variant calling as of version 5.
elPrep is a DNA sequence analysis software solution that’s up to sixteen times faster than other programs on similar computing hardware, all without using expensive GPU or FPGA acceleration. The reason for this remarkable increase in speed? A smart software architecture that:
Figure 1 WGS Benchmark*. Runtime, peak RAM, and disk use in GATK 4 (colored) vs. elPrep 5 (grey). Runtime and resource use for GATK 4 are shown per step in the pipeline, whereas all steps are combined into a single data point for elPrep 5. The GATK 4 runs were executed for both versions of the haplotype caller algorithm it implements. Compared with GATK 4, elPrep 5 executes the pipeline 8.5-16x faster, using approximately 0.70x of the RAM and approximately 0.70x of the disk space that GATK 4 uses. The outputs of elPrep are identical to the GATK outputs. (*50x NA12878 Illumina Platinum genome, hg38, run on AWS m5.24xlarge, Intel Xeon, 96 vCPU, 384 GiB RAM)
elPrep is developed by ExaScience Life Lab, a division of imec that focuses on software solutions for data-intensive and high-performance computing problems, primarily in life sciences. Thanks to this expertise, elPrep is a tool that produces identical results to established genome analysis programs such as SAMtools, Picard and GATK4.
Moreover, elPrep seamlessly replaces all these other tools, including for variant calling, giving you a single, ultra-fast solution for a large part of the DNA sequence analysis process.
Figure 2 AWS WGS Scaling Benchmark**. The graph shows the runtime (left) and the dollar cost (right) for running the variant calling pipeline on a variety of AWS server instances. The fastest elPrep run is more than 8x faster than GATK for roughly the same price. Concretely, elPrep processes the WGS sample in under 6 hours for approximately 32 dollars. (**m5.2xlarge: 8 vCPU, 32 GiB, $0.46/hour. m5.16xlarge: 64 vCPU, 256 GiB, $3.68/hour. m5.24xlarge: 96 vCPU, 384 GiB, $5.52/hour (September 2020 prices for EU Frankfurt))
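To see how the runtime and cost figures in the caption relate, here is a minimal Go sketch of the arithmetic. It assumes simple per-hour on-demand billing, counts compute cost only (no storage or data transfer), and assumes that the run of under 6 hours refers to the m5.24xlarge instance. The names `hourlyUSD`, `runCost` and `impliedHours` are illustrative and are not part of elPrep.

```go
package main

import "fmt"

// hourlyUSD holds the on-demand prices quoted in the figure caption
// (September 2020, EU Frankfurt), in US dollars per hour.
var hourlyUSD = map[string]float64{
	"m5.2xlarge":  0.46,
	"m5.16xlarge": 3.68,
	"m5.24xlarge": 5.52,
}

// runCost returns the compute cost in USD of a run that lasts the given
// number of hours on the given instance type (hypothetical helper).
func runCost(instance string, hours float64) (float64, error) {
	price, ok := hourlyUSD[instance]
	if !ok {
		return 0, fmt.Errorf("unknown instance type %q", instance)
	}
	return price * hours, nil
}

// impliedHours returns how many hours of the given instance type a budget
// in USD pays for (hypothetical helper).
func impliedHours(instance string, budgetUSD float64) (float64, error) {
	price, ok := hourlyUSD[instance]
	if !ok {
		return 0, fmt.Errorf("unknown instance type %q", instance)
	}
	return budgetUSD / price, nil
}

func main() {
	// Upper bound on cost for a run that finishes within 6 hours.
	cost, err := runCost("m5.24xlarge", 6)
	if err != nil {
		panic(err)
	}
	fmt.Printf("6 hours on m5.24xlarge: %.2f USD\n", cost)

	// Runtime implied by a bill of roughly 32 dollars.
	hours, err := impliedHours("m5.24xlarge", 32)
	if err != nil {
		panic(err)
	}
	fmt.Printf("32 USD on m5.24xlarge: %.2f hours\n", hours)
}
```

Under these assumptions, a bill of about 32 dollars corresponds to a little under 6 hours on the largest instance, which matches the caption.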
The elPrep DNA sequence analysis software is validated by Janssen Pharmaceutica and Seven Bridges Genomics, and is used in production at various Belgian hospitals and at companies such as BlueBee. It’s written in Go, an open-source programming language, and can run on any standard server, on-premises or in the cloud.
The elPrep source code is freely available on GitHub. Need customization or support? Don’t hesitate to contact us.