The data consists of 48x48 pixel grayscale images of faces. The faces have been automatically registered so that each face is more or less centered and occupies about the same amount of space in each image. The task is to categorize each face, based on the emotion shown in the facial expression, into one of seven categories (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral).
train.csv contains two columns, "emotion" and "pixels". The "emotion" column contains a numeric code ranging from 0 to 6, inclusive, for the emotion that is present in the image. The "pixels" column contains a quoted string for each image. This string holds space-separated pixel values in row-major order. test.csv contains only the "pixels" column, and your task is to predict the "emotion" column.
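The "pixels" format described above can be decoded with a few lines of NumPy. The following is a minimal sketch rather than the code used in this notebook: the helper name `pixels_to_image` and the `EMOTIONS` dictionary are hypothetical, and the sketch assumes every pixel value is an integer between 0 and 255.

```python
import numpy as np

# Label codes as listed in the dataset description.
EMOTIONS = {0: "Angry", 1: "Disgust", 2: "Fear", 3: "Happy",
            4: "Sad", 5: "Surprise", 6: "Neutral"}


def pixels_to_image(pixels, size=48):
    """Convert a space-separated pixel string into a (size, size) array.

    Hypothetical helper; assumes integer pixel values in the range 0-255.
    """
    values = np.array([int(v) for v in pixels.split()], dtype=np.uint8)
    if values.size != size * size:
        raise ValueError("expected %d pixel values, got %d"
                         % (size * size, values.size))
    # reshape fills rows first, which matches the row-major order of the file
    return values.reshape(size, size)


# Tiny 2x2 example: the first two values form the first row.
tiny = pixels_to_image("0 255 128 64", size=2)
```

Here `tiny[0]` holds the values 0 and 255 and `tiny[1]` holds 128 and 64. For a real row of train.csv, `pixels_to_image(row_pixels)` would return a 48x48 array, and `EMOTIONS[code]` would give the readable name of the label.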
The training set consists of 28,709 examples. The public test set used for the leaderboard consists of 3,589 examples. The final test set, which was used to determine the winner of the competition, consists of another 3,589 examples.
This dataset was prepared by Pierre-Luc Carrier and Aaron Courville, as part of an ongoing research project. They have graciously provided the workshop organizers with a preliminary version of their dataset to use for this contest.
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import cv2
from read_data import *
import matplotlib.gridspec as gridspec
# Commented because these are meant to be executed only once
# Takes a long time to run
# Read the data from the csv file
#filename = '../Data/fer2013/fer2013.csv'
#filename = os.path.join(curdir,filename)
#gen_record(filename, 1)
from os import listdir, walk
from os.path import isfile, join
# Read file names
files = []
for i in range(7):
    directory = "Training/"+str(i)
    ctr = 0
    for (dirpath, dirnames, filenames) in walk(directory):
        files.append(filenames)
len(files) # Verify number of folders
# Take 4 images for each class
subset = []
num_img = 4
for j in range(num_img):
    for i in range(7):
        directory = "Training/"+str(i)
        filename = directory + "/" + files[i][j]
        subset.append(filename)
len(subset) # Verify number of images = 7*4 = 28
# Function to plot the images in a grid
def plot(samples):
    fig = plt.figure(figsize =(10, 10))
    gs = gridspec.GridSpec(4, 7) # 28 images
    gs.update(wspace = 0.05, hspace = 0.05)
    for i, sample in enumerate(samples):
        ax = plt.subplot(gs[i])
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_aspect('equal')
        plt.imshow(sample.reshape(48, 48), cmap='Greys_r')
# Store the sampled images (4 per class, 28 in total) in a single array
imgs = np.zeros((len(subset), 48, 48))
for i in range(len(subset)):
    imgs[i] = plt.imread(subset[i])
print(" Angry Disgust Fear Happy Sad Surprise Neutral")
plot(imgs) # Plot the images