Compiler Vectorization Prediction

Figure: Distribution of instruction cache misses across non-vectorized and vectorized loops

Vectorization converts the scalar implementation of a program into a vector implementation, one that operates on several data elements per instruction. This project builds on recent work that uses hardware performance counters and machine learning techniques to predict compiler auto-vectorization. It validates similar machine learning models on a different architecture and compiler, and shows the benefits of data augmentation through sample synthesis for such applications. Using predictive models together with data augmentation on hardware performance data, I was able to successfully predict whether a compiler had auto-vectorized a program.
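To make the idea of data augmentation through sample synthesis concrete, here is a minimal sketch in plain Python. It is not the method used in the report: it assumes a SMOTE-style interpolation between a sample and its nearest same-class neighbour, followed by a simple nearest-centroid classifier. The feature names, the counter values, and the helper functions (`synthesize`, `centroid`, `predict`) are all hypothetical and chosen only for illustration.

```python
import math
import random


def synthesize(samples, n_new, rng):
    """Create n_new synthetic points, each lying on the segment between a
    randomly chosen sample and its nearest neighbour in the same class
    (a SMOTE-style interpolation; an assumption, not the report's method)."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        others = samples[:i] + samples[i + 1:]
        neighbor = min(others, key=lambda o: math.dist(base, o))
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def centroid(points):
    """Per-feature mean of a list of equally sized tuples."""
    return tuple(sum(column) / len(points) for column in zip(*points))


def predict(x, centroids):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


# Made-up illustrative feature vectors per loop, not measurements:
# (instruction cache misses per 1000 instructions, instructions per cycle)
vectorized = [(0.8, 2.1), (1.0, 2.4), (0.9, 2.0)]
not_vectorized = [(2.9, 1.1), (3.4, 0.9), (3.1, 1.3), (2.7, 1.0), (3.0, 1.2)]

# Balance the classes by synthesizing two extra "vectorized" samples.
rng = random.Random(0)
augmented = vectorized + synthesize(vectorized, 2, rng)

centroids = {
    "vectorized": centroid(augmented),
    "not_vectorized": centroid(not_vectorized),
}
label = predict((1.1, 2.2), centroids)
print(label)
```

The synthetic points stay inside the region spanned by the real minority-class samples, which is why this kind of augmentation can balance a small dataset without inventing values far from anything that was observed. A real experiment would replace the nearest-centroid step with the models evaluated in the report.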

The project report can be viewed or downloaded below:

The Jupyter notebooks containing the experiments and their results can be downloaded here.

Rahul Sridhar