Optimizing VR with Gaze-contingent Focus Displays | PNAS 2017

Nitish Padmanaban, Robert Konrad, Tal Stramer, Emily A. Cooper, Gordon Wetzstein

A gaze-contingent (varifocal) near-eye display technology with focus cues and an evaluation of how to make VR displays with focus cues accessible for users of all ages.


From the desktop to the laptop to the mobile device, personal computing platforms evolve over time. Moving forward, wearable computing is widely expected to be integral to consumer electronics and beyond. The primary interface between a wearable computer and a user is often a near-eye display. However, current generation near-eye displays suffer from multiple limitations: they are unable to provide fully natural visual cues and comfortable viewing experiences for all users. At their core, many of the issues with near-eye displays are caused by limitations in conventional optics. Current displays cannot reproduce the changes in focus that accompany natural vision, and they cannot support users with uncorrected refractive errors. With two prototype near-eye displays, we show how these issues can be overcome using display modes that adapt to the user via computational optics. By using focus-tunable lenses, mechanically actuated displays, and mobile gaze-tracking technology, these displays can be tailored to correct common refractive errors and provide natural focus cues by dynamically updating the system based on where a user looks in a virtual scene. Indeed, the opportunities afforded by recent advances in computational optics open up the possibility of creating a computing platform in which some users may experience better quality vision in the virtual world than in the real one.




N. Padmanaban, R. Konrad, T. Stramer, E.A. Cooper, G. Wetzstein. “Optimizing virtual reality for all users through gaze-contingent and adaptive focus displays”, Proceedings of the National Academy of Sciences, 2017.


author = {N. Padmanaban and R. Konrad and T. Stramer and E.A. Cooper and G. Wetzstein},
title = {Optimizing virtual reality for all users through gaze-contingent and adaptive focus displays},
journal = {Proceedings of the National Academy of Sciences},
year = {2017},
URL = {http://www.pnas.org/content/early/2017/02/07/1617251114.abstract}



We would like to thank Intel, Huawei, the Okawa Foundation, NSF, Samsung, and Google for generously supporting this work.




A schematic showing the basic components of a wearable gaze-contingent prototype. The stepper motor mounted on top rotates based on the gaze position reported by the integrated stereoscopic eye tracker and software, moving the phone back and forth and thereby dynamically adjusting the focal plane of the display (red arrows).



Photographs of our wearable gaze-contingent display prototype. A conventional near-eye display (Samsung Gear VR) is augmented by a stereoscopic gaze tracker and a motor that is capable of adjusting the physical distance between screen and lenses.


Only a few millimeters of physical display displacement result in a large change in the distance of the perceived virtual image. Overall system latency, including rendering, data transmission, and motor adjustment, is approximately 280 ms for a sweep from 4 D (25 cm) to 0 D (optical infinity) and 160 ms for a sweep from 3 D (33 cm) to 1 D (1 m). This latency is of the same order as the response time of the human accommodative system, which is a few hundred milliseconds.
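The claim that a few millimeters of travel cover the whole focus range can be checked with a rough thin-lens calculation. The sketch below is illustrative only: the 40 mm focal length is an assumed value, not a specification of the prototype, the eye is treated as sitting at the lens, and the function names are hypothetical.

```python
# Rough thin-lens estimate of how far the display must move to sweep the
# virtual image between two focus settings given in diopters (D).
# ASSUMPTION: 40 mm focal length; this is NOT a measured value for the prototype.
ASSUMED_FOCAL_LENGTH_M = 0.040


def display_distance_m(image_diopters: float, focal_length_m: float) -> float:
    """Lens-to-display distance that places the virtual image at the given
    vergence, assuming an ideal thin lens with the eye located at the lens."""
    if image_diopters < 0:
        raise ValueError("image vergence must be non-negative (0 D = infinity)")
    lens_power_d = 1.0 / focal_length_m
    # Thin lens: 1/d_display - 1/d_image = 1/f  =>  1/d_display = D + 1/f
    return 1.0 / (image_diopters + lens_power_d)


def travel_mm(from_diopters: float, to_diopters: float,
              focal_length_m: float = ASSUMED_FOCAL_LENGTH_M) -> float:
    """Absolute display displacement (mm) needed to sweep between two settings."""
    start = display_distance_m(from_diopters, focal_length_m)
    end = display_distance_m(to_diopters, focal_length_m)
    return abs(end - start) * 1000.0


if __name__ == "__main__":
    for start_d, end_d in [(4.0, 0.0), (3.0, 1.0)]:
        print(f"{start_d} D -> {end_d} D: {travel_mm(start_d, end_d):.2f} mm")
```

Under these assumptions the full 4 D to 0 D sweep needs only a few millimeters of travel, which is consistent with the statement above; a different focal length would change the exact figure.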


(top) A typical near-eye display uses a fixed focus lens to show a magnified virtual image of a microdisplay to each eye. The focal length of the lens, f, and the distance to the microdisplay, d’, determine the distance of the virtual image, d. Adaptive focus can be implemented using either a focus-tunable lens (green arrows) or a fixed focus lens and a mechanically actuated display (red arrows), so that the virtual image can be moved to different distances. (bottom) A benchtop setup designed to incorporate adaptive focus via focus-tunable lenses and an autorefractor to record accommodation. A translation stage adjusts intereye separation, and NIR/visible light beam splitters allow for simultaneous stimulus presentation and accommodation measurement.
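For reference, the relation between f, d’, and d in the top panel can be written with the standard thin-lens equation. This is a textbook idealization rather than something stated on this page: it treats the lens as thin, takes both distances as positive magnitudes measured from the lens, and ignores the gap between the eye and the lens.

```latex
\frac{1}{d'} - \frac{1}{d} = \frac{1}{f}
\quad\Longrightarrow\quad
d = \frac{f\,d'}{f - d'}, \qquad 0 < d' < f,
\qquad d \to \infty \ \text{as}\ d' \to f .
```

In this form, changing f (the focus-tunable lens) or changing d’ (the actuated display) both move the virtual image distance d, which is why either mechanism can implement adaptive focus.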



Prototype stereoscopic near-eye display with focus-tunable lenses and adjustable interpupillary distance via a translation stage. The system includes an autorefractor that is capable of recording the accommodative state of the user’s right eye continuously at 4–5 Hz. Insets show example stereoscopic views.



(top) The use of a fixed focus lens in conventional near-eye displays means that the magnified virtual image appears at a constant distance (orange planes). However, by presenting different images to the two eyes, objects can be simulated at arbitrary stereoscopic distances. To experience clear and single vision in VR, the user’s eyes have to rotate to verge at the correct stereoscopic distance (red lines), but the eyes must maintain accommodation at the virtual image distance (gray areas). (bottom) In a dynamic focus display, the virtual image distance (green planes) is constantly updated to match the stereoscopic distance of the target. Thus, the vergence and accommodation distances can be matched.
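The mismatch described in this caption can be put into numbers with a small sketch. It assumes symmetric fixation straight ahead, an interpupillary distance of 63 mm, and a fixed virtual image at 1 m in the conventional mode; these values and all function names are hypothetical and chosen only for illustration.

```python
import math

ASSUMED_IPD_M = 0.063            # ASSUMPTION: typical adult interpupillary distance
ASSUMED_FIXED_IMAGE_M = 1.0      # ASSUMPTION: fixed virtual image distance


def vergence_angle_deg(distance_m: float, ipd_m: float = ASSUMED_IPD_M) -> float:
    """Angle between the two lines of sight when fixating a point straight
    ahead at the given distance."""
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))


def virtual_image_distance_m(stereo_distance_m: float, dynamic: bool,
                             fixed_image_m: float = ASSUMED_FIXED_IMAGE_M) -> float:
    """Conventional mode keeps the image plane fixed; dynamic mode moves it
    to the stereoscopic distance of the fixated target."""
    return stereo_distance_m if dynamic else fixed_image_m


def conflict_diopters(stereo_distance_m: float, image_distance_m: float) -> float:
    """Vergence-accommodation mismatch expressed in diopters."""
    return abs(1.0 / stereo_distance_m - 1.0 / image_distance_m)


if __name__ == "__main__":
    target_m = 0.25  # stereoscopic target at 25 cm (4 D)
    for dynamic in (False, True):
        image_m = virtual_image_distance_m(target_m, dynamic)
        mode = "dynamic" if dynamic else "conventional"
        print(f"{mode}: vergence {vergence_angle_deg(target_m):.1f} deg, "
              f"conflict {conflict_diopters(target_m, image_m):.1f} D")
```

In the dynamic mode the conflict is zero by construction, because the image plane follows the target; in the conventional mode it grows as the target moves away from the fixed image distance.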



Accommodative responses were recorded under conventional and dynamic display modes while users watched a target move sinusoidally in depth. The accommodative gains plotted against the user’s age show a clear downward trend with age and a higher response in the dynamic condition. Inset shows means and SEs of the gains for users grouped into younger and older cohorts relative to 45 years of age.
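One common way to compute an accommodative gain for a sinusoidal stimulus is to fit a sinusoid at the stimulus frequency to the recorded response and divide the fitted amplitude by the stimulus amplitude. The sketch below illustrates that idea on synthetic data; it is not necessarily the analysis used in the paper, and the stimulus frequency, amplitudes, and function names are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with
    partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fitted_amplitude(times_s, values_d, freq_hz):
    """Least-squares fit of a*sin(wt) + b*cos(wt) + c to the samples;
    returns the amplitude sqrt(a^2 + b^2) in diopters."""
    w = 2.0 * math.pi * freq_hz
    rows = [(math.sin(w * t), math.cos(w * t), 1.0) for t in times_s]
    # Normal equations: (A^T A) x = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, values_d)) for i in range(3)]
    a, b, _offset = solve_linear(ata, aty)
    return math.hypot(a, b)


def accommodative_gain(times_s, response_d, stimulus_amplitude_d, freq_hz):
    """Ratio of response amplitude to stimulus amplitude at the stimulus frequency."""
    return fitted_amplitude(times_s, response_d, freq_hz) / stimulus_amplitude_d


if __name__ == "__main__":
    freq_hz = 0.1                              # ASSUMPTION: stimulus frequency
    times = [i / 5.0 for i in range(100)]      # 20 s sampled at 5 Hz
    # Synthetic response: 0.6 D amplitude, phase lag, 1.5 D mean level
    response = [1.5 + 0.6 * math.sin(2 * math.pi * freq_hz * t - 0.5) for t in times]
    print(f"gain = {accommodative_gain(times, response, 1.0, freq_hz):.2f}")
```

A gain near 1 means accommodation follows the stimulus fully, while a gain near 0 means the eye's focus barely changes. Fitting the offset term separately keeps a constant focusing bias from inflating the amplitude estimate.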