CS 243 Project - Hardware Counters and Data Augmentation for Predicting Vectorization

Data visualization

Visualize the relationship between the features and the target variable

Data dictionary:
BR_INST_EXEC.ALL_BRANCHES: Speculative and retired branches
Cycles (CPU_CLK_UNHALTED.THREAD_P): Thread cycles when thread is not in halt state
ICACHE.MISSES: Number of instruction cache, victim cache, and streaming buffer misses, including uncacheable accesses
Instructions (INST_RETIRED.ANY_P): Number of instructions retired
IPC: Instructions/Cycles
ITLB_MISSES.MISS_CAUSES_A_WALK: Misses at all ITLB levels that cause a page walk
CYCLE_ACTIVITY.CYCLES_L1D_PENDING: Cycles while L1 cache miss demand load is outstanding
L1D.REPLACEMENT: L1D data line replacements
L2_cache_misses (L2_RQSTS.MISS): All requests that miss L2 cache
L2_cache_accesses (L2_RQSTS.REFERENCES): All L2 requests
MACHINE_CLEARS.COUNT: Number of machine clears (nukes) of any type
MACHINE_CLEARS.CYCLES: Cycles where there was a nuke (thread-specific and all thread)
MEM_LOAD_UOPS_RETIRED.L1_MISS: Retired load uops that miss in the L1 cache as data source
MISALIGN_MEM_REF.LOADS: Speculative cache line split load uops dispatched to L1 cache
RESOURCE_STALLS.ANY: Resource-related stall cycles
UOPS_EXECUTED.CORE: Number of uops executed on the core
DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK: Load misses in all DTLB levels that cause page walks
UOPS_EXECUTED.THREAD: Counts the number of uops to be executed per thread each cycle
UOPS_ISSUED.ANY: Uops that resource allocation table (RAT) issues to reservation station (RS)
UOPS_ISSUED.STALL_CYCLES: Cycles when RAT does not issue uops to RS for the thread
UOPS_RETIRED.ALL: Actually retired uops
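
As a small illustration of the derived IPC entry, the sketch below divides retired instructions by unhalted cycles. The counter values and the helper name `ipc` are made up for illustration and are not taken from the project's data. The columns shown further down appear to be rescaled to [0, 1], so the IPC column there is not simply the ratio of the rescaled Instructions and Cycles columns.

```python
def ipc(instructions_retired, unhalted_cycles):
    """Instructions per cycle from two raw counter readings."""
    if unhalted_cycles <= 0:
        raise ValueError("cycle count must be positive")
    # float() keeps the division exact under Python 2 as well.
    return instructions_retired / float(unhalted_cycles)

# Hypothetical raw readings for one loop (not from the project's data).
example_ipc = ipc(instructions_retired=3000000, unhalted_cycles=1500000)
print(example_ipc)
```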

In [1]:
%matplotlib inline
In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
from Utilities import Utilities as utils
plt.style.use('ggplot')

Import data from all runs

In [3]:
data = pd.read_pickle('Intermediate/Data_final')
print(data.shape)
data.head()
(151, 23)
Out[3]:
Symbol Name BR_INST_EXEC.ALL_BRANCHES Cycles (CPU_CLK_UNHALTED.THREAD_P) ICACHE.MISSES Instructions (INST_RETIRED.ANY_P) IPC ITLB_MISSES.MISS_CAUSES_A_WALK CYCLE_ACTIVITY.CYCLES_L1D_PENDING L1D.REPLACEMENT L2_cache_misses (L2_RQSTS.MISS) ... MEM_LOAD_UOPS_RETIRED.L1_MISS MISALIGN_MEM_REF.LOADS RESOURCE_STALLS.ANY UOPS_EXECUTED.CORE DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK UOPS_EXECUTED.THREAD UOPS_ISSUED.ANY UOPS_ISSUED.STALL_CYCLES UOPS_RETIRED.ALL Vectorizable
0 vdotr 0.780054 1.000000 0.846777 0.953049 0.333166 0.091284 0.009918 0.352163 0.038586 ... 0.006236 0.001107 0.576490 0.825337 1.000000 0.844710 0.989464 0.814133 0.846817 1
1 vsumr 0.363920 0.994448 1.000000 0.371609 0.120150 0.091871 0.008573 0.177088 0.004370 ... 0.007573 0.000633 0.855463 0.582767 0.769073 0.443357 0.410981 0.703915 0.446982 1
2 s312 0.361293 0.989835 0.758251 0.372705 0.121151 0.254541 0.018059 0.175171 0.011072 ... 0.009501 0.000273 0.923626 0.645358 0.742128 0.431895 0.398940 0.982609 0.427202 1
3 s311 0.348797 0.974573 0.585715 0.366408 0.120901 0.406182 0.028118 0.173145 0.018200 ... 0.009169 0.000294 0.922912 0.631597 0.773651 0.432357 0.405950 0.936244 0.427589 1
4 s233 0.137980 0.992884 0.970062 0.164879 0.043805 1.000000 1.000000 1.000000 1.000000 ... 1.000000 0.000075 1.000000 0.118286 0.098588 0.208184 0.227743 1.000000 0.202813 0

5 rows × 23 columns

In [4]:
data['Vectorizable'].value_counts()
Out[4]:
0    85
1    66
Name: Vectorizable, dtype: int64
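
The class counts above fix a simple reference point for any later classifier: always predicting the majority class. The sketch below works that out from the two counts printed above; the variable names are illustrative only.

```python
# Class counts copied from the value_counts() output above.
counts = {0: 85, 1: 66}

total = sum(counts.values())
majority_class = max(counts, key=counts.get)
# Accuracy of a model that always predicts the majority class.
baseline_accuracy = counts[majority_class] / float(total)
# Fraction of loops labelled vectorizable.
positive_rate = counts[1] / float(total)

print(round(baseline_accuracy, 4))
print(round(positive_rate, 4))
```

A model is only informative here if it clearly beats the majority-class accuracy of 85/151.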

Understand the data / Exploratory Data Analysis (EDA)

In [5]:
print(data.dtypes)
Symbol Name                                 object
BR_INST_EXEC.ALL_BRANCHES                  float64
Cycles (CPU_CLK_UNHALTED.THREAD_P)         float64
ICACHE.MISSES                              float64
Instructions (INST_RETIRED.ANY_P)          float64
IPC                                        float64
ITLB_MISSES.MISS_CAUSES_A_WALK             float64
CYCLE_ACTIVITY.CYCLES_L1D_PENDING          float64
L1D.REPLACEMENT                            float64
L2_cache_misses (L2_RQSTS.MISS)            float64
L2_cache_accesses (L2_RQSTS.REFERENCES)    float64
MACHINE_CLEARS.COUNT                       float64
MACHINE_CLEARS.CYCLES                      float64
MEM_LOAD_UOPS_RETIRED.L1_MISS              float64
MISALIGN_MEM_REF.LOADS                     float64
RESOURCE_STALLS.ANY                        float64
UOPS_EXECUTED.CORE                         float64
DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK        float64
UOPS_EXECUTED.THREAD                       float64
UOPS_ISSUED.ANY                            float64
UOPS_ISSUED.STALL_CYCLES                   float64
UOPS_RETIRED.ALL                           float64
Vectorizable                                 int64
dtype: object
In [6]:
# Variable names
columns = data.columns.values.tolist()
In [7]:
data.describe()
Out[7]:
BR_INST_EXEC.ALL_BRANCHES Cycles (CPU_CLK_UNHALTED.THREAD_P) ICACHE.MISSES Instructions (INST_RETIRED.ANY_P) IPC ITLB_MISSES.MISS_CAUSES_A_WALK CYCLE_ACTIVITY.CYCLES_L1D_PENDING L1D.REPLACEMENT L2_cache_misses (L2_RQSTS.MISS) L2_cache_accesses (L2_RQSTS.REFERENCES) ... MEM_LOAD_UOPS_RETIRED.L1_MISS MISALIGN_MEM_REF.LOADS RESOURCE_STALLS.ANY UOPS_EXECUTED.CORE DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK UOPS_EXECUTED.THREAD UOPS_ISSUED.ANY UOPS_ISSUED.STALL_CYCLES UOPS_RETIRED.ALL Vectorizable
count 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000 ... 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000 151.000000
mean 0.242500 0.212024 0.195325 0.266251 0.551535 0.102550 0.044080 0.125738 0.066211 0.097334 ... 0.055874 0.013193 0.102666 0.199156 0.099130 0.254646 0.277286 0.157823 0.256343 0.437086
std 0.222994 0.217685 0.207413 0.224682 0.258924 0.152498 0.124071 0.148057 0.137475 0.134491 ... 0.156027 0.097789 0.189176 0.193757 0.156507 0.222378 0.226364 0.197243 0.222170 0.497677
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
25% 0.074780 0.074332 0.057193 0.111349 0.350939 0.023209 0.003909 0.042992 0.007141 0.036498 ... 0.002422 0.000007 0.012942 0.076519 0.008964 0.104077 0.117930 0.045707 0.105412 0.000000
50% 0.151101 0.121562 0.115710 0.192760 0.619274 0.051483 0.009426 0.070256 0.030526 0.058990 ... 0.006834 0.000024 0.037662 0.129832 0.046054 0.181226 0.205073 0.083049 0.184905 0.000000
75% 0.308285 0.273407 0.264857 0.389335 0.766583 0.112528 0.026489 0.174158 0.058429 0.106982 ... 0.019599 0.000080 0.080891 0.236636 0.114080 0.369996 0.402445 0.170930 0.371584 1.000000
max 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 ... 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000

8 rows × 22 columns
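
Every feature column in the summary above has a minimum of 0 and a maximum of 1, which is consistent with per-column min-max scaling. The notebook does not show that step, so the sketch below is an assumption about how such values could be produced, using made-up raw readings.

```python
import pandas as pd

# Hypothetical raw counter readings (not the project's data).
raw = pd.DataFrame({
    'ICACHE.MISSES': [120.0, 480.0, 300.0],
    'UOPS_RETIRED.ALL': [1.0e6, 4.0e6, 2.5e6],
})

# Per-column min-max scaling: (x - min) / (max - min).
scaled = (raw - raw.min()) / (raw.max() - raw.min())
print(scaled)
```

A column with a single constant value would divide by zero here and come out as NaN, so such columns need to be dropped or handled separately.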

In [8]:
# Get names of all continuous variables
continuous_vars = [columns[i] for i in np.where(data.dtypes != 'O')[0]]
continuous_vars.remove('Vectorizable') # Remove target variable
print(len(continuous_vars))
21
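
The same column selection can be written with `select_dtypes`, which avoids indexing into `columns` by position. This is an alternative sketch on a made-up toy frame, not the code the notebook ran.

```python
import pandas as pd

# Toy frame with the same kinds of columns as `data` (values are made up).
toy = pd.DataFrame({
    'Symbol Name': ['a', 'b', 'c'],
    'IPC': [0.1, 0.5, 0.9],
    'ICACHE.MISSES': [0.2, 0.4, 0.6],
    'Vectorizable': [1, 0, 1],
})

# Keep the numeric columns, then drop the target variable.
numeric_cols = toy.select_dtypes(include='number').columns
feature_cols = numeric_cols.drop('Vectorizable').tolist()
print(feature_cols)
```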
In [11]:
# Plot the distribution of each variable in vectorized and non-vectorized loops
for var in continuous_vars:
    plt.figure(figsize = (16, 5))
    plt.subplot(121)
    plt.hist(data.loc[data['Vectorizable']==0, var])
    #plt.title('Vectorizable = 0 vs. '+var)
    plt.ylabel('Number of non-vectorized loops')
    plt.xlabel(var)
    
    plt.subplot(122)
    plt.hist(data.loc[data['Vectorizable']==1, var])
    plt.ylabel('Number of vectorized loops')
    plt.xlabel(var)
    #plt.title('Vectorizable = 1; '+var)
    plt.show()
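
In the loop above each panel lets `plt.hist` choose its own bins, so bar widths and x ranges can differ between the left and right plots. One option is to compute a single set of bin edges and reuse it for both classes. The sketch below does this with NumPy on synthetic values (not the project's data); the resulting `edges` array could be passed to `plt.hist` through its `bins` argument.

```python
import numpy as np

rng = np.random.RandomState(0)
# Synthetic stand-ins for one feature split by class (not the project's data).
non_vectorized = rng.uniform(0.0, 1.0, size=85)
vectorized = rng.uniform(0.0, 0.5, size=66)

# Ten equal-width bins over [0, 1], shared by both classes.
edges = np.linspace(0.0, 1.0, 11)
counts_0, _ = np.histogram(non_vectorized, bins=edges)
counts_1, _ = np.histogram(vectorized, bins=edges)

print(counts_0)
print(counts_1)
```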

Interesting variables (judging from the histograms above):
ICACHE.MISSES, Instructions (INST_RETIRED.ANY_P), ITLB_MISSES.MISS_CAUSES_A_WALK, L1D.REPLACEMENT, L2_cache_accesses (L2_RQSTS.REFERENCES), RESOURCE_STALLS.ANY, UOPS_EXECUTED.CORE, UOPS_EXECUTED.THREAD, UOPS_ISSUED.ANY, UOPS_RETIRED.ALL
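
The list above appears to come from reading the histograms by eye. A rough numeric cross-check is to rank features by how far apart the two class means are relative to the overall spread. This is a swapped-in heuristic, not the method used in the notebook, and the toy frame and feature names below are made up.

```python
import pandas as pd

# Toy data (made up): 'f_sep' differs by class, 'f_flat' does not.
toy = pd.DataFrame({
    'f_sep':  [0.10, 0.12, 0.08, 0.90, 0.88, 0.92],
    'f_flat': [0.50, 0.40, 0.60, 0.50, 0.40, 0.60],
    'Vectorizable': [0, 0, 0, 1, 1, 1],
})

features = ['f_sep', 'f_flat']
class_means = toy.groupby('Vectorizable')[features].mean()
# Absolute gap between class means, in units of the overall standard deviation.
mean_gap = (class_means.loc[1] - class_means.loc[0]).abs()
score = (mean_gap / toy[features].std()).sort_values(ascending=False)
print(score)
```

Features near the top of `score` are the ones whose class means differ most. A mean-based score can miss differences in shape or tails, so it complements the histograms rather than replacing them.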

t-SNE visualization

Reduce data dimensionality and look at the distribution of the target variable in the new data space

In [15]:
from sklearn.manifold import TSNE
In [21]:
tsne = TSNE(n_components=2, random_state = 2)
# Columns 1..21 are the 21 counter features (column 0 is the symbol name, column 22 the target)
embs = tsne.fit(data.iloc[:, 1:22])
In [33]:
plt.scatter(embs.embedding_[:, 0], embs.embedding_[:, 1], c=data.Vectorizable)
Out[33]:
<matplotlib.collections.PathCollection at 0x117ce7750>
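
For reference, `fit_transform` returns the 2-D coordinates directly, which is equivalent to calling `fit` and then reading `embedding_`. The sketch below runs it on a synthetic matrix with the same shape as the feature block (151 rows, 21 columns). The data and the explicit `perplexity` value are assumptions for illustration, not settings taken from the notebook.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.RandomState(2)
# Synthetic stand-in for the 151 x 21 feature matrix (not the project's data).
X = rng.uniform(0.0, 1.0, size=(151, 21))

# perplexity is expected to be smaller than the number of samples;
# 30 is scikit-learn's default value.
tsne = TSNE(n_components=2, perplexity=30, random_state=2)
coords = tsne.fit_transform(X)  # same array as tsne.embedding_
print(coords.shape)
```

t-SNE layouts depend on the perplexity and the random seed, so it is worth rerunning with a few values of each before reading structure into any apparent clusters.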