Gaze following, i.e., detecting the gaze target of a human subject, in 2D images has become an active topic in computer vision. However, it usually suffers from the out-of-frame issue due to the limited field-of-view (FoV) of 2D images. In this paper, we introduce a novel task, gaze following in 360-degree images, which provide an omnidirectional FoV and can alleviate the out-of-frame issue. We collect the first dataset for this task, "GazeFollow360", containing around 10,000 360-degree images with complex gaze behaviors under various scenes. Existing 2D gaze following methods suffer from performance degradation in 360-degree images, since they may rely on the assumption that the gaze target lies on the 2D gaze sight line. However, this assumption no longer holds for long-distance gaze behaviors in 360-degree images, due to the distortion introduced by sphere-to-plane projection. To address this challenge, we propose a 3D sight line guided dual-pathway framework that detects the gaze target within a local region (here) and in a distant region (there) in parallel. Specifically, the local region is obtained as a 2D cone-shaped field along the 2D projection of the sight line starting at the human subject's head position, and the distant region is obtained by searching along the sight line in 3D sphere space. Finally, the location of the gaze target is determined by fusing the estimations from both the local region and the distant region. Experimental results show that our method achieves significant improvements over previous 2D gaze following methods on our GazeFollow360 dataset.
The dataset is available free of charge for research purposes only. If you find anything confusing or spot a mistake, we welcome questions and suggestions by email: lyhsjtu@sjtu.edu.cn. Please cite the paper below if you use the dataset. Dataset link: Baidu Netdisk ( https://pan.baidu.com/s/1aTVH6WH3pcz6OysQ4Tmo7g , password: njqk)
• Spherical distance: we use the spherical distance, a.k.a. the great-circle distance, as our evaluation metric. It is the shortest distance between two points on the surface of a sphere.
• AUC: we use the Area Under Curve (AUC) criterion to assess a predicted gaze target heatmap. For a fair comparison, all predicted heatmaps are resampled (upsampled or downsampled) to 64*64 and compared against a ground-truth heatmap of the same size, generated with kernel size 3, to calculate the AUC score.
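For reference, the spherical (great-circle) distance between two locations in an equirectangular 360-degree image can be computed as in the minimal sketch below. The pixel-to-sphere mapping (`pixel_to_sphere`) and the function names are our own illustrative conventions, not taken from the dataset's official evaluation code; the haversine formula itself is standard.

```python
import numpy as np

def pixel_to_sphere(x, y, width, height):
    """Map equirectangular pixel coordinates to (longitude, latitude) in radians.

    Assumes x in [0, width) spans longitude [-pi, pi] and y in [0, height)
    spans latitude [pi/2, -pi/2] (a common, but here hypothetical, convention).
    """
    lon = (x / width) * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (y / height) * np.pi
    return lon, lat

def spherical_distance(p, q, width, height):
    """Great-circle distance (in radians, on the unit sphere) between two pixels."""
    lon1, lat1 = pixel_to_sphere(p[0], p[1], width, height)
    lon2, lat2 = pixel_to_sphere(q[0], q[1], width, height)
    # Haversine formula: numerically stable for small angular separations.
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2.0) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2.0) ** 2
    return 2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
```

As a sanity check, two points on the equator separated by half the image width are antipodal, so their distance is pi; multiplying the result by the sphere's radius converts it to a metric distance.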
@inproceedings{li2021looking,
title={Looking here or there? {G}aze following in 360-degree images},
author={Li, Yunhao and Shen, Wei and Gao, Zhongpai and Zhu, Yucheng and Zhai, Guangtao and Guo, Guodong},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={3742--3751},
year={2021}
}