Using human observer eye movements in automatic image classifiers

Abstract
We explore the way in which people look at images of different semantic categories and directly relate those results to computational approaches for automatic image classification. Our hypothesis is that the eye movements of human observers differ for images of different semantic categories, and that this information can be effectively used in automatic content-based classifiers. First, we present eye tracking experiments that show the variation in eye movements across different individuals for image of 5 different categories: handshakes, crowd, landscapes, main object in uncluttered background, and miscellaneous. The eye tracking results suggest that similar viewing patterns occur when different subjects view different images in the same semantic category. Using these results, we examine how empirical data obtained from eye tracking experiments across different semantic categories can be integrated with existing computational frameworks, or used to construct new ones. In particular, we examine the Visual Apprentice, a system in which images classifiers are learned form user input as the user defines a multiple level object definition hierarchy based on an object and its parts and labels examples for specific classes. The resulting classifiers are applied to automatically classify new images. Although many eye tracking experiments have been performed, to our knowledge, this is the first study that specifically compares eye movements across categories, and that links category-specific eye tracking results to automatic image classification techniques.

This publication has 0 references indexed in Scilit: