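Hyper-Q is an exciting new feature in the Nvidia Kepler architecture. In current Fermi systems, all work submitted from the host funnels through a single hardware work queue, so kernels issued from different MPI tasks or CUDA streams can end up serialized on the GPU even when they are independent of one another. In Kepler (GK110) systems, Hyper-Q provides 32 hardware-managed work queues, which allows multiple CPU cores to launch work from up to 32 MPI tasks or CUDA streams simultaneously on a single GPU. Hyper-Q thus increases the utilization and efficiency of GPU workloads, which in turn decreases CPU idle time. In the best case, for applications that previously left most of the GPU idle because of that single queue, Hyper-Q could result in up to a 32X increase in performance; actual gains depend on how many small, independent kernels the application can keep in flight.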
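As a rough illustration of the kind of workload described above, here is a minimal sketch that issues one small, independent kernel on each of several CUDA streams. The kernel `scale`, the helper `launch_on_streams`, the `CUDA_TRY` macro, and the buffer size are hypothetical names and values chosen for this example. Whether the kernels actually overlap depends on the GPU, the driver, and the resources each kernel uses.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-check macro: prints the failing call and returns 1.
// For brevity, this sketch does not release resources on the error path.
#define CUDA_TRY(call)                                               \
    do {                                                             \
        cudaError_t e_ = (call);                                     \
        if (e_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s failed: %s\n", #call,                \
                    cudaGetErrorString(e_));                         \
            return 1;                                                \
        }                                                            \
    } while (0)

// Deliberately small kernel: a single launch uses one block and therefore
// cannot fill the GPU on its own.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Issues one independent kernel per stream.
// Returns 0 on success, 1 on invalid input or if any CUDA call fails.
int launch_on_streams(int num_streams)
{
    if (num_streams <= 0) return 1;

    const int n = 256;                       // elements per stream (illustrative)
    const size_t bytes = n * sizeof(float);
    std::vector<cudaStream_t> streams(num_streams);
    std::vector<float *> buffers(num_streams, nullptr);

    for (int s = 0; s < num_streams; ++s) {
        CUDA_TRY(cudaStreamCreate(&streams[s]));
        CUDA_TRY(cudaMalloc(&buffers[s], bytes));
        CUDA_TRY(cudaMemsetAsync(buffers[s], 0, bytes, streams[s]));
    }

    // Each launch goes to its own stream, so no launch depends on another.
    for (int s = 0; s < num_streams; ++s) {
        scale<<<1, n, 0, streams[s]>>>(buffers[s], n, 2.0f);
        CUDA_TRY(cudaGetLastError());
    }

    // Wait for all streams to finish before releasing resources.
    CUDA_TRY(cudaDeviceSynchronize());

    for (int s = 0; s < num_streams; ++s) {
        CUDA_TRY(cudaFree(buffers[s]));
        CUDA_TRY(cudaStreamDestroy(streams[s]));
    }
    return 0;
}
```

Calling `launch_on_streams(32)` from a host program matches the 32-stream case discussed above. The same source runs on Fermi and on Kepler; the difference is in how the hardware schedules the queued kernels. On Kepler, the number of hardware connections the runtime uses can reportedly be adjusted with the `CUDA_DEVICE_MAX_CONNECTIONS` environment variable, so check the CUDA documentation for your toolkit version before relying on a particular value.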
With Hyper-Q, GPU utilization increases dramatically, allowing GPUs to operate closer to peak efficiency. Hyper-Q removes a performance bottleneck that exists in current-generation Fermi systems, namely the single work queue between the host and the GPU. CUDA software developers can exploit this increased efficiency to speed up existing code with minimal code revisions. GPU cores will no longer have to sit idle waiting for work from CPUs, because Hyper-Q lets the host dispatch a much greater workload concurrently. Along with Dynamic Parallelism and several other new features, Kepler’s Hyper-Q promises even more dramatic performance improvements for existing applications.
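Before expecting any of these gains from existing code, it can help to confirm that the target GPU is likely to offer Hyper-Q at all. The sketch below is one possible check, assuming that Hyper-Q is tied to compute capability 3.5 or higher (the GK110 generation) and to the device reporting concurrent kernel support. The helper name `device_may_support_hyperq` is hypothetical, and the CUDA runtime does not expose a dedicated Hyper-Q flag, so this is a heuristic and not a definitive test.

```cuda
#include <cuda_runtime.h>

// Heuristic check (assumption: Hyper-Q requires compute capability >= 3.5).
// Returns 1 if the device looks capable, 0 if not, -1 if the query fails.
int device_may_support_hyperq(int dev)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
        return -1;  // invalid device index or no usable CUDA device
    }

    // Compute capability 3.5 or newer.
    bool cc_ok = prop.major > 3 || (prop.major == 3 && prop.minor >= 5);

    // concurrentKernels is nonzero when the device can run multiple
    // kernels at the same time.
    return (cc_ok && prop.concurrentKernels) ? 1 : 0;
}
```

A host program could call `device_may_support_hyperq(0)` at startup and fall back to fewer streams, or simply log a note, when the result is not 1.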