Name: Aniket Shivam
Chair: Alexander V. Veidenbaum
Date: May 25, 2021
Time: 3:30 PM
Committee: Alexander V. Veidenbaum, Alexandru Nicolau, and Tony Givargis
Title: A Multiple Compiler Approach for Improved Performance and Efficiency
Production compilers have achieved a high level of maturity in generating efficient code. They embed numerous code optimization techniques developed over the last four decades, with a special focus on loop nest optimizations. The code generated by any two production compilers can differ substantially, depending on the strengths and weaknesses of their respective Intermediate Representations (IR), the loop transformations they implement and the order in which these are applied, the cost models they use, and even their instruction selection (such as vector instructions) and scheduling. Compilers must also predict how a multi-core processor, with its complex pipelines, multiple functional units, and complex memory hierarchy, will affect overall performance. Hence, the performance of the code that a given compiler produces for a program segment may not be matched by other compilers. Additionally, there is no way of knowing how close a compiler gets to optimal performance or whether there is any headroom for improvement.
The complexity and rigidity of the compilation process make it very difficult to modify a given compiler to improve the generated code in every case where it could not produce the best possible code. Therefore, this thesis presents a compilation approach that turns the differences between the compilation processes and performance optimizations of individual compilers from a weakness into a strength. This approach is implemented as a novel compilation framework, the MCompiler. This meta-compilation framework allows different segments of a program to be compiled using an ensemble of compilers/optimizers and combined into a single executable. Utilizing the highest performing code for each segment, identified via Exploratory Search, can lead to a significant overall improvement in performance. The framework is shown to produce performance improvements for serial (including auto-vectorized), auto-parallelized, and hand-optimized (using OpenMP) parallel code.
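The per-segment selection idea can be illustrated with a minimal sketch. It assumes that each segment has already been compiled by every candidate compiler and timed; the segment names, compiler names, and timings below are hypothetical placeholders, not results from the thesis, and the function names are illustrative rather than part of the MCompiler.

```python
from typing import Dict

# Measured runtime (seconds) of each segment under each candidate compiler.
Timings = Dict[str, Dict[str, float]]


def select_best_per_segment(timings: Timings) -> Dict[str, str]:
    """Return, for every segment, the candidate with the lowest measured time."""
    return {seg: min(cands, key=cands.get) for seg, cands in timings.items()}


def total_time(timings: Timings, choice: Dict[str, str]) -> float:
    """Sum the runtimes of the chosen variant of every segment."""
    return sum(timings[seg][choice[seg]] for seg in choice)


# Hypothetical measurements, for illustration only.
timings: Timings = {
    "loop_nest_1": {"compiler_a": 1.20, "compiler_b": 0.90, "compiler_c": 1.05},
    "loop_nest_2": {"compiler_a": 0.40, "compiler_b": 0.75, "compiler_c": 0.50},
    "loop_nest_3": {"compiler_a": 2.10, "compiler_b": 2.30, "compiler_c": 1.80},
}

best = select_best_per_segment(timings)
mixed = total_time(timings, best)

# Baseline: the whole program built with a single compiler.
single = {
    c: sum(timings[seg][c] for seg in timings)
    for c in ("compiler_a", "compiler_b", "compiler_c")
}
```

In this toy data no single compiler wins every segment, so the mixed build is faster than any single-compiler build. The sketch ignores effects that a real framework would have to handle, such as linking segments from different compilers and interactions between segments.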
Next, this thesis explores the possibility of learning which compiler will produce the best code for a segment. This is accomplished using Machine Learning. The Machine Learning models learn the inherent characteristics of loop nests and then predict which code optimizer is best suited for each loop nest in an application. These models are then incorporated into the MCompiler to predict, during compilation, the best code optimizer for each code segment of the application. This feature allows the MCompiler to replace the expensive Exploratory Search with Machine Learning predictions while keeping performance very close to that achieved with Exploratory Search.
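As a rough illustration of replacing search with prediction, the sketch below uses a one-nearest-neighbor classifier over loop-nest feature vectors. The abstract does not state which features or models the thesis uses, so the feature layout, training samples, and the name `predict_optimizer` are assumptions made purely for illustration.

```python
import math
from typing import List, Sequence, Tuple

# A training sample pairs a loop-nest feature vector with the optimizer that
# produced the fastest code for that loop nest (the label).
Sample = Tuple[Sequence[float], str]


def predict_optimizer(features: Sequence[float], training: List[Sample]) -> str:
    """Return the label of the training sample closest to `features`."""
    nearest = min(training, key=lambda sample: math.dist(sample[0], features))
    return nearest[1]


# Hypothetical features: (nest depth, fraction of unit-stride accesses,
# fraction of branch instructions). Labels are placeholder compiler names.
training: List[Sample] = [
    ((3.0, 0.9, 0.1), "compiler_a"),
    ((1.0, 0.2, 0.8), "compiler_b"),
    ((2.0, 0.5, 0.5), "compiler_c"),
]

prediction = predict_optimizer((2.9, 0.8, 0.2), training)
```

Once such a model is trained, each loop nest needs only one prediction at compile time instead of being built and timed with every candidate compiler.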
Finally, this thesis extends the compilation approach to achieve energy efficiency on modern architectures. Prior research has argued both for and against the hypothesis that optimizing for performance translates into optimizing for energy efficiency. No production compiler optimizes for energy efficiency directly; instead, compilers rely on performance optimizations translating into higher energy efficiency. Optimizing for performance is already complex on recent generations of processors, and with automatic dynamic voltage and frequency scaling (DVFS) management in these processors, optimizing for energy efficiency would add another level of complexity for compilers with no guarantee of success. Using the MCompiler, this thesis shows how performance-oriented compiler optimizations can be used to achieve energy efficiency.
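The distinction between the fastest and the most energy-efficient variant of a segment can be sketched with the basic relation energy = average power × runtime. The measurements below are hypothetical and the helper names are illustrative; how the thesis actually measures energy is not described in this abstract.

```python
from typing import Dict, Tuple

# Per-variant measurement of one segment: (runtime in seconds, average power in watts).
Measurement = Tuple[float, float]


def energy_joules(runtime_seconds: float, avg_power_watts: float) -> float:
    """Energy consumed over the run: average power multiplied by runtime."""
    return avg_power_watts * runtime_seconds


# Hypothetical measurements, for illustration only.
measurements: Dict[str, Measurement] = {
    "compiler_a": (1.00, 60.0),  # slower, but lower average power
    "compiler_b": (0.90, 70.0),  # faster, but higher average power
}

# Variant with the shortest runtime.
fastest = min(measurements, key=lambda c: measurements[c][0])

# Variant with the lowest energy consumption.
most_efficient = min(measurements, key=lambda c: energy_joules(*measurements[c]))
```

In this toy data the faster variant draws enough extra power to consume more energy overall, which is one way the two objectives can diverge. Selecting per segment by measured energy rather than by runtime reuses the same ensemble of performance-oriented compilers.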