Name: Chenying Hsieh

Date: Wednesday Aug 7, 2019

Time: 2:00 pm
Location: Donald Bren Hall 4011
Committee: Nikil Dutt (Chair), Tony Givargis, and Ardalan Amiri Sani
Title: Cooperative Power and Resource Management for Heterogeneous Mobile Architectures
Heterogeneous architectures have become ubiquitous in mobile systems-on-chip (SoCs). Demand from application domains such as games, computer vision, and machine learning, which require massively parallel computation, has driven the integration of more accelerators into mobile SoCs to deliver satisfactory performance energy-efficiently. These on-chip computing resources typically have their own runtime systems, each comprising (1) a software governor that continuously monitors hardware utilization and decides the trade-off between performance and power consumption, and (2) a software stack that lets application developers program the hardware for general-purpose computation and perform memory management and profiling. Because mobile applications may demand arbitrary combinations of computing resources, we identify two problems: (1) runtimes that operate individually often yield a poor performance-power trade-off or inefficient utilization of computing resources, and (2) existing approaches fail to schedule subprograms across different computing resources and thus miss the opportunity to avoid resource contention and improve performance.
To address these issues, this dissertation proposes a holistic approach that coordinates the different runtimes with regard to application performance and energy efficiency. First, we present MEMCOP, a memory-aware collaborative CPU-GPU governor that considers both the memory access footprint and the CPU/GPU frequency to improve the energy efficiency of high-end mobile game workloads through dynamic voltage and frequency scaling (DVFS). Second, we present a case study that executes a mix of popular data-parallel workloads, such as convolutional neural networks (CNNs), computer vision filters, and graphics rendering kernels, on mobile devices, and show that both the performance and the energy consumption of mobile platforms can be improved by synergistically deploying underutilized compute resources. Third, we present SURF, a self-aware unified runtime framework for parallel programs on heterogeneous mobile architectures. SURF supports several heterogeneous parallel programming languages (including OpenMP and OpenCL) and enables dynamic task mapping to heterogeneous resources based on runtime measurement and prediction. The measurement and monitoring loop enables self-aware adaptation of the runtime mapping to dynamically exploit the best available resource.
We implemented all software components on real-world mobile SoCs and evaluated the proposed approaches with mobile games and with a mix of parallel benchmarks and CNN applications.