Eccentricity-dependent Spatio-temporal Flicker Fusion | SIGGRAPH 2021

Brooke Krajancich, Petr Kellnhofer, Gordon Wetzstein

We derive an eccentricity-dependent spatio-temporal model of the visual system to enable the development of new temporally foveated graphics techniques.

SIGGRAPH 2021 - 3 Min Overview

ABSTRACT

Virtual and augmented reality (VR/AR) displays strive to provide a resolution, framerate and field of view that matches the perceptual capabilities of the human visual system, all while constrained by limited compute budgets and transmission bandwidths of wearable computing systems. Foveated graphics techniques have emerged that could achieve these goals by exploiting the falloff of spatial acuity in the periphery of the visual field. However, considerably less attention has been given to temporal aspects of human vision, which also vary across the retina. This is in part due to limitations of current eccentricity-dependent models of the visual system. We introduce a new model, experimentally measuring and computationally fitting eccentricity-dependent critical flicker fusion thresholds jointly for both space and time. In this way, our model is unique in enabling the prediction of temporal information that is imperceptible for a certain spatial frequency, eccentricity, and range of luminance levels. We validate our model with an image quality user study, and use it to predict potential bandwidth savings 7× higher than those afforded by current spatial-only foveated models. As such, this work forms the enabling foundation for new temporally foveated graphics techniques.

FILES

NOTE: Equation 3 and Table 3 have been corrected and thus may differ from previous versions.

CITATION

B. Krajancich, P. Kellnhofer, G. Wetzstein, “A Perceptual Model for Eccentricity-dependent Spatio-temporal Flicker Fusion and its Applications to Foveated Graphics”, in ACM Trans. Graph., 40 (4), 2021.

BibTeX
@article{Krajancich:2020:spatiotemp_model,
  author  = {Krajancich, Brooke and Kellnhofer, Petr and Wetzstein, Gordon},
  title   = {A Perceptual Model for Eccentricity-dependent Spatio-temporal Flicker Fusion and its Applications to Foveated Graphics},
  journal = {ACM Trans. Graph.},
  volume  = {40},
  number  = {4},
  year    = {2021}
}

Measurement User Study

Custom VR display: To develop an eccentricity-dependent model of flicker fusion, we need a display capable of showing stimuli at a high framerate and over a wide field of view. Unable to find a suitable commercial display, we designed and built a custom VR display that meets these requirements. As shown in the photograph, a neutral density filter reduces the brightness of a digital light projector (DLP), which projects onto a semitransparent diffuser that serves as the projection screen. The user views the screen from the opposite side, where the magnifying optics are provided by a View-Master Deluxe VR Viewer with the back panel removed.



Gabor Wavelet Sampling: An eccentricity-dependent model that varies with spatial frequency must adhere to the uncertainty principle. That is, low spatial frequencies cannot be well localized in eccentricity. This behavior is appropriately modeled by wavelets. As such, we conduct a user study, sampling the flicker fusion thresholds across each user’s retina using a set of 2D Gabor wavelets. The above animation shows a few of these wavelets and their relative size, frequency and eccentricity.
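For illustration, the sketch below is a hypothetical reconstruction (not the actual study code) of how a counter-phase flickering 2D Gabor stimulus can be generated; the function name and all parameter values are placeholders, with the patch center standing in for retinal eccentricity under central fixation.

import numpy as np

def gabor_flicker_frames(size_px=512, px_per_deg=20.0,
                         spatial_freq_cpd=2.0, center_deg=(10.0, 0.0),
                         sigma_deg=1.5, flicker_hz=30.0,
                         frame_rate_hz=240.0, n_frames=240,
                         mean_lum=0.5, contrast=1.0):
    # Spatial coordinates in degrees of visual angle, origin at screen center
    xs = (np.arange(size_px) - size_px / 2) / px_per_deg
    x, y = np.meshgrid(xs, xs)
    x0, y0 = center_deg

    # Gaussian envelope localizes the patch at the desired eccentricity
    envelope = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma_deg ** 2))
    # Sinusoidal carrier sets the spatial frequency (cycles per degree)
    carrier = np.cos(2 * np.pi * spatial_freq_cpd * (x - x0))
    gabor = envelope * carrier

    frames = []
    for i in range(n_frames):
        t = i / frame_rate_hz
        # Counter-phase flicker: contrast reverses sinusoidally over time
        modulation = np.sin(2 * np.pi * flicker_hz * t)
        frames.append(mean_lum * (1.0 + contrast * modulation * gabor))
    return np.clip(np.stack(frames), 0.0, 1.0)

Raising flicker_hz until the modulation becomes invisible at a given patch center and spatial frequency is the kind of threshold measurement the study performs for each wavelet.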

Perceptual Limitations Vary Across the Retina



Foveated graphics approaches exploit the well-known drop in acuity, our ability to resolve spatial detail, towards the periphery of the visual field to reduce bandwidth requirements in AR/VR. Yet the critical flicker fusion (CFF) threshold, our sensitivity to changes over time, also varies across the retina, peaking in the mid-periphery. This suggests that further bandwidth savings could be achieved by exploiting perceptual limitations in the temporal domain. However, the highest temporal frequency we can perceive depends on the spatial frequency of the stimulus, its luminance, and where it falls on the retina. Until now, progress on new foveated graphics techniques has been hampered by the lack of a unified perceptual model that captures these varying spatio-temporal characteristics across the visual field.

A Unified Model



A Model for Eccentricity-dependent Flicker Fusion: We experimentally measure and then computationally fit user data to derive a unified model of critical flicker fusion (CFF). Like current foveated graphics systems, our model predicts visual information that cannot be perceived in the spatial domain, but it is unique in also indicating what temporal information is imperceptible for a given spatial frequency, eccentricity, and display luminance (represented by the region above the surface shown above).
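To make the idea of querying such a model concrete, the sketch below uses a classic Ferry-Porter form (CFF growing linearly with log luminance) with coefficients that are assumed, purely for illustration, to vary with eccentricity and spatial frequency; it is not the paper's fitted Equation 3, whose parameterization and coefficients are given in the paper (Equation 3 and Table 3).

import numpy as np

def cff_hz(spatial_freq_cpd, eccentricity_deg, luminance_cd_m2,
           a=25.0, b=12.0, k_ecc=0.2, k_sf=3.0):
    # Placeholder predictor (NOT the paper's fitted Equation 3):
    # Ferry-Porter form CFF = a' + b * log10(L), with an intercept a'
    # that rises with eccentricity and falls with spatial frequency.
    # The real, fitted coefficients are reported in the paper's Table 3.
    a_eff = a + k_ecc * eccentricity_deg - k_sf * np.log2(1.0 + spatial_freq_cpd)
    lum = np.maximum(np.asarray(luminance_cd_m2, dtype=float), 1e-3)
    return np.maximum(0.0, a_eff + b * np.log10(lum))

# Any temporal frequency above the returned value is predicted to be
# invisible at that spatial frequency, eccentricity, and luminance.
print(cff_hz(spatial_freq_cpd=1.0, eccentricity_deg=30.0, luminance_cd_m2=100.0))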

Applications in Foveated Graphics



Enabling new temporally foveated approaches: Our model defines the gamut of visual signals perceivable by human vision. We conduct a user study showing that our model can distinguish perceptible from imperceptible spatio-temporal artifacts better than several existing video quality metrics (e.g., PSNR, SSIM, and VMAF). Furthermore, we provide a theoretical analysis of the compression gains this model could enable for foveated graphics applications when used to allocate resources such as bandwidth.
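As a rough sketch of how a CFF model could drive temporal foveation, the example below picks a per-region update rate just above the predicted flicker-fusion limit; the allocation policy, the headroom factor, and the stand-in CFF model are illustrative assumptions, not the paper's analysis.

import numpy as np

def region_update_rates(cff_model, eccentricities_deg, spatial_freq_cpd,
                        luminance_cd_m2, display_hz=120.0, headroom=1.2):
    # Refresh each retinal region just above its predicted critical flicker
    # fusion frequency so the reduced update rate stays imperceptible;
    # cap at the native display rate. cff_model is any callable returning
    # CFF in Hz for (spatial_freq_cpd, eccentricity_deg, luminance_cd_m2),
    # e.g. the placeholder cff_hz() sketched above.
    ecc = np.asarray(eccentricities_deg, dtype=float)
    cff = cff_model(spatial_freq_cpd, ecc, luminance_cd_m2)
    return np.minimum(display_hz, headroom * cff)

# Example with a crude stand-in CFF model that peaks in the mid-periphery;
# display_hz divided by each returned rate approximates the per-region
# temporal bandwidth saving.
toy_cff = lambda sf, ecc, lum: 40.0 + 0.25 * ecc * np.exp(-ecc / 40.0)
print(region_update_rates(toy_cff, [0.0, 20.0, 40.0, 60.0],
                          spatial_freq_cpd=4.0, luminance_cd_m2=50.0))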

Related Projects

You may also be interested in related projects from our group on perceptual aspects of near-eye displays:

  • R. Konrad et al. “Gaze-contingent Ocular Parallax Rendering for Virtual Reality”, ACM Transactions on Graphics 2020 (link)
  • B. Krajancich et al. “Optimizing Depth Perception in Virtual and Augmented Reality through Gaze-contingent Stereo Rendering”, ACM SIGGRAPH Asia 2020 (link)
  • N. Padmanaban et al. “Optimizing virtual reality for all users through gaze-contingent and adaptive focus displays”, PNAS 2017 (link)

and other next-generation near-eye display and wearable technology:

  • Y. Peng et al. “Neural Holography with Camera-in-the-loop Training”, ACM SIGGRAPH 2020 (link)
  • B. Krajancich et al. “Factored Occlusion: Single Spatial Light Modulator Occlusion-capable Optical See-through Augmented Reality Display”, IEEE TVCG, 2020 (link)
  • N. Padmanaban et al. “Autofocals: Evaluating Gaze-Contingent Eyeglasses for Presbyopes”, Science Advances 2019 (link)
  • K. Rathinavel et al. “Varifocal Occlusion-Capable Optical See-through Augmented Reality Display based on Focus-tunable Optics”, IEEE TVCG 2019 (link)
  • R. Konrad et al. “Accommodation-invariant Computational Near-eye Displays”, ACM SIGGRAPH 2017 (link)
  • R. Konrad et al. “Novel Optical Configurations for Virtual Reality: Evaluating User Preference and Performance with Focus-tunable and Monovision Near-eye Displays”, ACM SIGCHI 2016 (link)
  • F.C. Huang et al. “The Light Field Stereoscope: Immersive Computer Graphics via Factored Near-Eye Light Field Display with Focus Cues”, ACM SIGGRAPH 2015 (link)


ACKNOWLEDGEMENTS

B.K. was supported by a Stanford Knight-Hennessy Fellowship. G.W. was supported by an Okawa Research Grant, a Sloan Fellowship, and a PECASE by the ARO. Other funding for the project was provided by NSF (award numbers 1553333 and 1839974). The authors would also like to thank Brian Wandell, Anthony Norcia, and Joyce Farrell for advising on temporal mechanisms of the human visual system, Keith Winstein for expertise used in the development of the validation study, Darryl Krajancich for constructing the apparatus for our custom VR display, and Yifan (Evan) Peng for assisting with the measurement of the display luminance.