(Human Genome derived from UCSC Genome Browser)
Next Generation Sequencing (NGS) technology is advancing at a rapid pace. New DNA sequencers from Illumina and ThermoFisher/LifeTechnologies can generate tens of millions to several billion reads (DNA/cDNA fragments) in a few hours or days. With barcoding and multiplexing, several samples can be run on a single chip or lane. This combination of high read count and multiple samples per run presents a real challenge for the bioinformatics data analysis community.
NGS data analysis often includes a time-consuming sequence alignment step. Typically, the DNA reads coming off the sequencers are aligned to a so-called reference genome. This step maps sample reads to the reference genome, and the resulting alignments are used for SNP analysis and other types of reports. The current human reference genome contains about 3.1 billion nucleotide bases. This is a fairly large genome (although there are considerably larger genomes in other species). Sequence alignment of NGS sample reads against the human reference genome can take several hours on a reasonably high-end server or cluster. Although this is not excessive for a single sample, with the sheer number of samples handled by current sequencers, and with multiple sequencers in an NGS Core or lab, the accumulated time spent in the sequence alignment step can be problematic.
To help alleviate this alignment issue, we have been evaluating parallel sequence alignment algorithms and software alignment tools. As mentioned in an earlier blog post, we’re testing the NVBIO suite on Nvidia GPUs. nvBowtie is a GPU-enabled sequence alignment tool in this software suite, and bowtie2 is its popular CPU-only counterpart. To compare the performance of these two applications, we performed a DNA sequence alignment benchmark test with human genome datasets. The results are presented here.
We obtained a sample test case from the NCBI Sequence Read Archive (SRA). The SRA contains raw sequencing data from a variety of NGS vendors, DNA sequencing instruments, and research centers. I recommend using the DNAnexus Sequence Read Archive+ query interface to search SRA; it’s considerably more user-friendly than the direct SRA interface.
For the benchmark tests, we used SRA data from the following project:
- SRA run ID: SRR1058066
- NGS platform: Illumina HiSeq 2500
- Read info: 44,433,542 paired-end reads
- Species: Homo sapiens
- Project title: “Targeted next-generation sequencing of head and neck squamous cell carcinoma identifies novel genetic alterations in HPV+ and HPV- tumors.” For more information about the study see Lechner et al.
We used human genome build hg38 as the reference genome. Therefore, we ran sequence alignments of SRR1058066 against hg38.
Nvidia Tesla K80 GPU
(Tesla K80 GPU)
- 4,992 GPU cores
- 24 GB GDDR5 RAM
- 480 GB/sec. memory bandwidth
- 300 W power consumption (important spec!)
We ran the benchmark tests on a single K80 GPU in an Intel processor cluster.
Sequence Alignment Tools
For the benchmark comparisons we used nvBowtie v.0.9.9.3 from the NVBIO suite and bowtie2 v.2.2.4 sequence alignment tools. nvBowtie is designed for highly-parallel GPU-only sequence alignments, whereas bowtie2 is designed for moderately-parallel CPU-only alignments. In a sense this is a comparison of fine-grained vs. coarse-grained parallelism.
The GPU benchmark runs for this work were performed on Microway’s Tesla GPU Test Drive accelerated compute cluster. We thank Microway for providing access to a GPU-enabled cluster to run the benchmark codes.
For the CPU-only tests, we ran bowtie2 with the following system and run configurations:
- Cray XT6m
- (2) 12-core AMD Opteron 6100 CPUs per compute node
- 32 GB RAM per compute node
- 12 CPU-threads
- Ran bowtie2 on a single compute node
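A run matching the configuration above could be launched along these lines. This is only a sketch: the index basename `hg38` and the FASTQ and SAM file names are assumptions for illustration (they are not stated in this post), and the helper `bowtie2_command` is hypothetical. The flags themselves (`-p`, `-x`, `-1`, `-2`, `-S`) are standard bowtie2 options.

```python
import shlex


def bowtie2_command(index_prefix, mate1, mate2, sam_out, threads=12):
    """Build a bowtie2 paired-end command line as an argument list."""
    return [
        "bowtie2",
        "-p", str(threads),  # number of alignment threads (12 in this benchmark)
        "-x", index_prefix,  # basename of the prebuilt bowtie2 index
        "-1", mate1,         # FASTQ file with the first mates
        "-2", mate2,         # FASTQ file with the second mates
        "-S", sam_out,       # SAM output file
    ]


# Hypothetical file names; adjust to wherever the index and reads actually live.
cmd = bowtie2_command(
    "hg38",
    "SRR1058066_1.fastq",
    "SRR1058066_2.fastq",
    "SRR1058066.sam",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The script only prints the command. To actually launch it, the list can be passed to `subprocess.run(cmd, check=True)` on a node where bowtie2 and the index are installed.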
The results of the CPU-only sequence alignment runs were:
- 206 min. or 3.4 hrs.
For the GPU-only tests, we ran nvBowtie with the following system and run configurations:
- Intel cluster
- (2) 12-core Xeon E5-2680v3 CPUs per compute node
- 128 GB host CPU RAM per compute node
- (1) Nvidia Tesla K80 GPU per compute node
- Ran nvBowtie on a single compute node
The results of the GPU-only sequence alignment runs were:
- 16 min. or 0.27 hrs.
Comparing the CPU-only bowtie2 run versus the GPU-only nvBowtie run, the recorded speedup was:
- 206 min. / 16 min. = 12.8X
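The arithmetic behind these figures can be checked with a few lines of Python. The timings and the read count are the ones reported in this post; the function names are just illustrative. Note that the exact ratio is 12.875, which the post reports as 12.8X. The reads-per-minute figure assumes the 44,433,542 value counts the reads processed in each run.

```python
CPU_MINUTES = 206         # bowtie2 wall-clock time reported above
GPU_MINUTES = 16          # nvBowtie wall-clock time reported above
TOTAL_READS = 44_433_542  # read count listed for SRR1058066


def hours(minutes):
    """Convert minutes to hours."""
    return minutes / 60


def speedup(baseline_minutes, candidate_minutes):
    """Ratio of baseline run time to candidate run time."""
    return baseline_minutes / candidate_minutes


def reads_per_minute(reads, minutes):
    """Average alignment throughput over the whole run."""
    return reads / minutes


print(f"bowtie2:  {hours(CPU_MINUTES):.2f} h, "
      f"{reads_per_minute(TOTAL_READS, CPU_MINUTES):,.0f} reads/min")
print(f"nvBowtie: {hours(GPU_MINUTES):.2f} h, "
      f"{reads_per_minute(TOTAL_READS, GPU_MINUTES):,.0f} reads/min")
print(f"speedup:  {speedup(CPU_MINUTES, GPU_MINUTES):.3f}X")
```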
The chart below shows the speedup of nvBowtie vs. bowtie2.
(Speedup of human genome sequence alignment with CPU-only bowtie2 vs. GPU-only nvBowtie)
The 12.8X speedup of nvBowtie on a K80 GPU is encouraging. For this human genome sequence alignment, it reduced the wall-clock run time from about 3.4 hours to about 16 minutes.
The older AMD Opteron series processors used in these tests have been superseded by newer AMD FX series and Intel 5th generation processors, among others. The speedups seen here would undoubtedly be reduced somewhat when compared against newer CPU models. However, nvBowtie is currently at version 0.9 (not even version one yet). We would expect continued algorithm development and optimization in nvBowtie to improve its performance, which could keep speedups somewhere in the 8X – 10X range. In any event, shaving hours off the sequence alignment runtimes is significant.
NGS Pipeline Updates
In our NextGen Sequencing Core we manage numerous bioinformatics pipelines for the analysis of NGS datasets, including:
In each pipeline the main computational bottleneck is the DNA (cDNA) sequence alignment step. This step can take several hours to run on our high-end servers and clusters. In a busy NGS Core with a large number of samples to process, this can be a significant impediment for data analysis. Given the encouraging results from the nvBowtie benchmark runs described here, we are migrating our sequence alignment tools to NVBIO. We’ll run A/B tests with bowtie2 and nvBowtie to compare performance and numerical results between the two sequence alignment tools. And we’ll see if this improves the turnaround time for bioinformatics data analysis.
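One simple way to compare the numerical results of two aligners is to measure how often they place the same read at the same position. The sketch below is an assumption about how such an A/B check might look, not the comparison we have settled on: it reads plain-text SAM records, keeps primary alignments only, and reports the fraction of shared reads with an identical reference name and position. The function names and the toy records are made up for illustration.

```python
def primary_alignments(sam_lines):
    """Map (read name, mate number) -> (reference, position), or None if unmapped."""
    placements = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        placements[(qname, mate)] = None if flag & 0x4 else (rname, pos)
    return placements


def concordance(sam_a, sam_b):
    """Fraction of reads present in both inputs that received the same placement."""
    a = primary_alignments(sam_a)
    b = primary_alignments(sam_b)
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    same = sum(1 for key in shared if a[key] == b[key])
    return same / len(shared)


# Toy records (made up): one read pair plus one unmapped read per "aligner".
toy_a = [
    "@HD\tVN:1.0",
    "r1\t99\tchr1\t100\t42\t50M\t=\t300\t250\t*\t*",
    "r1\t147\tchr1\t300\t42\t50M\t=\t100\t-250\t*\t*",
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
toy_b = [
    "@HD\tVN:1.0",
    "r1\t99\tchr1\t100\t42\t50M\t=\t305\t255\t*\t*",
    "r1\t147\tchr1\t305\t42\t50M\t=\t100\t-255\t*\t*",
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(f"concordance: {concordance(toy_a, toy_b):.3f}")
```

For real runs, the two inputs would be open SAM file handles, for example `concordance(open("bowtie2.sam"), open("nvbowtie.sam"))` with hypothetical file names. BAM output would first need converting to SAM text, and exact-position matching is a strict criterion; a tolerance of a few bases may be more appropriate for reads in repetitive regions.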