A Landmark Detection and Iris Prediction Dataset for Gaze Tracking Research

Open Access
Conference Proceedings
Authors: Brett Thaman, Nicholas Caporusso, Trung Cao

Abstract: Gaze tracking is an established technology that detects the position of the eyes and uses it to estimate where the user is looking. For decades, gaze tracking has relied primarily on dedicated devices with infrared (IR) sensors, but the need for specific hardware has limited its adoption in large-scale applications. In the last decade, several research groups have pursued gaze tracking solutions based on computer vision and traditional RGB cameras, such as the webcams embedded in portable computers and mobile devices. Unfortunately, previous studies have shown that gaze tracking systems based on RGB cameras have significantly lower accuracy, are not suitable for tasks that require precise user control, and require further research and development. Recently, TensorFlow released a landmark detection library that predicts the location of key points of the human face, including the position of the eyes. The algorithm outputs approximately 500 features, each point being represented by its coordinates in three-dimensional space. Although TensorFlow’s landmark detection algorithm could potentially be used for gaze tracking, the number and complexity of its features make it impractical for real-time gaze prediction without further feature extraction and dimensionality reduction. In this paper, we introduce and discuss a dataset designed to stimulate screen-based gaze tracking research aimed at replacing traditional IR devices with standard RGB cameras. Our objective was to label the features estimated by TensorFlow’s landmark detection and iris prediction model with the actual location of the user’s gaze on a screen. To this end, we collected data from 30 users who performed gaze tracking tasks. Each sample in our dataset includes all the features of TensorFlow’s landmark detection and iris prediction model and two different labels representing (1) the actual gaze location acquired with a dedicated IR sensor, and (2) a reference point. In this paper, we detail the data collection software and procedure, describe the dataset, and discuss its potential use in advancing gaze tracking research.
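
As an illustration of the sample structure described in the abstract, the following is a minimal Python sketch, not the dataset's actual schema. The class name `GazeSample`, its field names, the assumption that both labels are two-dimensional screen coordinates, and the toy values are all hypothetical. The sketch flattens the three-dimensional landmarks into a single feature vector and measures how far apart the two labels are.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Point3D = Tuple[float, float, float]  # one landmark: (x, y, z)
Point2D = Tuple[float, float]         # assumed screen position: (x, y)


@dataclass
class GazeSample:
    """Hypothetical layout of one sample; field names are assumptions."""
    landmarks: List[Point3D]   # face and iris key points from the landmark model
    ir_gaze: Point2D           # label 1: gaze location from the dedicated IR sensor
    reference_point: Point2D   # label 2: reference point

    def feature_vector(self) -> List[float]:
        # Flatten [(x, y, z), ...] into [x0, y0, z0, x1, y1, z1, ...].
        return [coord for point in self.landmarks for coord in point]

    def label_disagreement(self) -> float:
        # Euclidean distance between the two labels, in the labels' own units.
        dx = self.ir_gaze[0] - self.reference_point[0]
        dy = self.ir_gaze[1] - self.reference_point[1]
        return hypot(dx, dy)


# Toy sample with two landmarks; real samples would hold roughly 500 points.
sample = GazeSample(
    landmarks=[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)],
    ir_gaze=(103.0, 204.0),
    reference_point=(100.0, 200.0),
)
features = sample.feature_vector()
```

With roughly 500 three-dimensional points per sample, a flattened vector of this kind would have on the order of 1,500 values, which is consistent with the abstract's remark that feature extraction and dimensionality reduction are needed before real-time prediction.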

Keywords: Human-Computer Interaction, Gaze Tracking, Artificial Intelligence

DOI: 10.54941/ahfe100924
