In-the-wild eye contact detection. Typical scene for eye contact detection in the wild, where pedestrians might be far from the camera and heavily occluded. Our method estimates, from predicted body poses, whether people are paying attention (shown in green) through eye contact with a self-driving agent, or are distracted (shown in red). This information can then be used as an external modality to better forecast their behaviors and to reduce the risk of collision.
Prediction output from one of our trained models on our LOOK test set. By escaping the 2D space of the image and relying on low-dimensional information (human keypoints), the model is able to generalize across various scenarios and contexts.
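To make the keypoint-based idea concrete, the sketch below shows a minimal binary classifier that maps a flattened, normalized 2D pose to an eye-contact probability. This is an illustrative assumption, not the exact architecture used in the paper: the layer sizes, the use of 17 COCO-style keypoints, and the `KeypointLookClassifier` name are all hypothetical.

```python
# Minimal sketch of a keypoint-based "looking" classifier (illustrative only;
# layer sizes and the 17-keypoint COCO format are assumptions, not the paper's spec).
import torch
import torch.nn as nn

class KeypointLookClassifier(nn.Module):
    """Predicts the probability of eye contact from a flattened 2D body pose."""

    def __init__(self, num_keypoints: int = 17, hidden_dim: int = 256):
        super().__init__()
        # Each keypoint contributes (x, y, confidence).
        self.net = nn.Sequential(
            nn.Linear(num_keypoints * 3, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # single logit: looking vs. not looking
        )

    def forward(self, keypoints: torch.Tensor) -> torch.Tensor:
        # keypoints: (batch, num_keypoints * 3), normalized to each pedestrian's bounding box
        return torch.sigmoid(self.net(keypoints))

# Example usage on a dummy batch of 4 detected pedestrians.
model = KeypointLookClassifier()
poses = torch.rand(4, 17 * 3)
looking_prob = model(poses)  # probabilities in (0, 1); threshold e.g. at 0.5
```

Because the input is a low-dimensional pose vector rather than raw pixels, such a classifier is largely decoupled from image appearance, which is the property the caption above refers to when describing generalization across scenarios.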
| Dataset  | Frames | Instances | Looking (%) | Pedestrians |
|----------|--------|-----------|-------------|-------------|
| nuScenes | 2,216  | 13K       | 9%          | 7,100       |
| KITTI    | 1,391  | 4,630     | 17%         | 425         |
| JRDB     | 9,441  | 39K       | 18%         | 399         |
| LOOK     | 13K    | 57K       | 16%         | 7,944       |