|Title||ONAC: Optimal Number of Active Cores Detector for Energy Efficient GPU Computing|
|Publication Type||Conference Papers|
|Authors||Zhu, X., M. Awatramani, D. Rover, and J. Zambreno|
|Conference Name||Proceedings of the International Conference on Computer Design (ICCD)|
Graphics Processing Units (GPUs) have become a prevalent platform for high-throughput general-purpose computing. The peak computational throughput of GPUs has been steadily increasing with each technology node by scaling the number of cores on the chip. Although this vastly improves the performance of several compute-intensive applications, our experiments show that some applications can achieve peak performance without utilizing all cores on the chip. We refer to the number of cores at which an application's performance saturates as the optimal number of active cores (Nopt). We propose executing the application on Nopt cores and power-gating the unused cores to reduce static power consumption.
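To make the definition of Nopt concrete, the following is a minimal sketch that picks the smallest core count whose measured performance is within a tolerance of the peak. The function name, the 2% tolerance, and the throughput numbers are illustrative assumptions, not values or code from the paper.

```python
def optimal_active_cores(perf_by_cores, tolerance=0.02):
    """Return the smallest core count whose performance is within
    `tolerance` of the best observed performance (assumed criterion)."""
    peak = max(perf_by_cores.values())
    for n in sorted(perf_by_cores):
        if perf_by_cores[n] >= (1.0 - tolerance) * peak:
            return n


# Hypothetical throughput (arbitrary units) of a kernel that stops
# scaling once memory bandwidth becomes the bottleneck.
curve = {2: 20.0, 4: 40.0, 6: 58.0, 8: 69.0, 10: 70.0, 12: 70.0, 14: 70.0}

n_opt = optimal_active_cores(curve)
gated = max(curve) - n_opt  # cores that could be power-gated
print(n_opt, gated)
```

With this made-up curve, the remaining cores beyond the detected count contribute almost no throughput, which is the situation in which power-gating them saves static power at little performance cost.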
To this end, we present ONAC (Optimal Number of Active Cores detector), a runtime technique to detect Nopt. ONAC uses a novel estimation model that significantly reduces the number of hardware samples needed to detect the optimal core count, compared to a sequential detection technique (Seq-Det). We implement ONAC and Seq-Det in a cycle-level GPU performance simulator and analyze their effect on performance, power, and energy. Our evaluation shows that ONAC and Seq-Det reduce energy consumption by 20% and 10% on average, respectively, for memory-intensive applications, without sacrificing more than 2% performance. The higher energy savings of ONAC come from reducing the detection time by 45% compared to Seq-Det.
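As a rough illustration of why a sequential search is costly in hardware samples, the sketch below steps the active core count upward and stops once the relative gain falls under a threshold. The abstract does not specify the search direction, step size, or stopping rule of the sequential technique, so all of these, along with the `measure` callback and the toy workload, are assumptions. ONAC's estimation model is not reproduced here.

```python
def sequential_detect(measure, total_cores, step=1, min_gain=0.02):
    """Increase the active core count step by step until the relative
    performance gain drops below `min_gain` (assumed stopping rule).
    Returns (detected core count, number of samples taken)."""
    n = step
    best = measure(n)
    samples = 1
    while n + step <= total_cores:
        perf = measure(n + step)
        samples += 1
        if perf < best * (1.0 + min_gain):
            break  # performance has saturated; keep the previous count
        n += step
        best = perf
    return n, samples


def toy_measure(n):
    """Hypothetical workload whose throughput saturates at 8 cores."""
    return min(n, 8) * 10.0


n_opt, samples = sequential_detect(toy_measure, total_cores=16)
print(n_opt, samples)
```

In this toy setting the search needs one sample per core count up to and just past the saturation point, so detection time grows with Nopt. A model that estimates the saturation point from fewer samples would shorten the detection phase accordingly.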