MantissaCam for Snapshot HDR Imaging | ICCP 2022

Haley So, Julien Martel, Piotr Dudek, Gordon Wetzstein

Learning Snapshot High-dynamic-range Imaging with Perceptually-based In-pixel Irradiance Encoding

 

This project won the Best Demo Award at ICCP 2022!

ABSTRACT

The ability to image high-dynamic-range (HDR) scenes is crucial in many computer vision applications. The dynamic range of conventional sensors, however, is fundamentally limited by their well capacity, resulting in saturation of bright scene parts. To overcome this limitation, emerging sensors offer in-pixel processing capabilities to encode the incident irradiance. Among the most promising encoding schemes is modulo wrapping, which results in a computational photography problem where the HDR scene is computed by an irradiance unwrapping algorithm from the wrapped low-dynamic-range (LDR) sensor image. Here, we design a neural network–based algorithm that outperforms previous irradiance unwrapping methods and, more importantly, we design a perceptually inspired “mantissa” encoding scheme that more efficiently wraps an HDR scene into an LDR sensor. Combined with our reconstruction framework, MantissaCam achieves state-of-the-art results among modulo-type snapshot HDR imaging approaches. We demonstrate the efficacy of our method in simulation and show preliminary results of a prototype MantissaCam implemented with a programmable sensor.
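The core of the unwrapping problem can be summarized in a few lines. The sketch below (illustrative Python, not the authors' code) shows how a modulo sensor folds HDR irradiance into the LDR range and why reconstruction reduces to estimating a per-pixel integer winding number; the variable names and the unit sensor range are assumptions for illustration.

import numpy as np

def modulo_encode(irradiance, sensor_max=1.0):
    """Wrap HDR irradiance into the LDR sensor range [0, sensor_max)."""
    return np.mod(irradiance, sensor_max)

def modulo_decode(wrapped, winding, sensor_max=1.0):
    """Unwrap an LDR measurement given per-pixel integer winding numbers."""
    return wrapped + winding * sensor_max

x = np.array([0.3, 1.7, 4.2])               # HDR irradiance (sensor saturates at 1.0)
y = modulo_encode(x)                        # wrapped LDR measurement: approx. [0.3, 0.7, 0.2]
k = np.floor(x / 1.0).astype(int)           # ground-truth winding numbers: [0, 1, 4]
assert np.allclose(modulo_decode(y, k), x)  # unwrapping recovers the HDR values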

FILES

  • Technical paper (arxiv)
  • Code (coming soon)

CITATION

H. So, J. Martel, P. Dudek, G. Wetzstein, MantissaCam: Learning Snapshot High-dynamic-range Imaging with Perceptually-based In-pixel Irradiance Encoding, IEEE International Conference on Computational Photography (ICCP), 2022.

@inproceedings{so2022,
author = {Haley So and Julien Martel and Piotr Dudek and Gordon Wetzstein},
title = {MantissaCam: Learning Snapshot High-dynamic-range Imaging with Perceptually-based In-pixel Irradiance Encoding},
booktitle = {IEEE International Conference on Computational Photography (ICCP)},
year = {2022}
}

MantissaCam - Teaser
MantissaCam electronically encodes the irradiance incident on the sensor into an LDR image by wrapping the intensity in a perceptually inspired manner (left). The proposed reconstruction algorithm estimates the HDR scene from this LDR image (center) and achieves accurate reconstructions compared to the ground truth (right).

 

MantissaCam - histograms
Log histogram of normalized irradiance values of all pixels in our training and test sets of HDR images for all color channels (top). This histogram is highly biased towards low-intensity values, indicating that irradiance values of natural images are not uniformly distributed. Yet, the modulo encoding subdivides this intensity range uniformly and wraps each of these areas into the available dynamic range of the sensor, as shown for a 1D ramp (center). The proposed mantissa encoding wraps the same 1D ramp in a perceptually more uniform manner in log space, which is observed as non-uniform wrapping in irradiance space (bottom).
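To make the difference concrete, here is an illustrative Python sketch of the two encodings under an assumed parameterization (unit sensor range for the modulo encoding, a base-2 mantissa for the proposed encoding; the in-pixel implementation in the paper may normalize differently). It shows that the modulo winding number grows linearly with intensity, while the mantissa "winding number" (the exponent) grows only logarithmically.

import numpy as np

def modulo_encode(x, s=1.0):
    """Return (wrapped value, winding number): the range is subdivided uniformly."""
    return np.mod(x, s), np.floor(x / s).astype(int)

def mantissa_encode(x, eps=1e-8):
    """Return (normalized mantissa, exponent): the range is subdivided uniformly in log space."""
    e = np.floor(np.log2(np.maximum(x, eps))).astype(int)   # exponent plays the role of the winding number
    m = x / np.power(2.0, e) - 1.0                           # normalized mantissa in [0, 1)
    return m, e

x = np.array([0.5, 1.0, 2.0, 8.0, 32.0])   # normalized irradiance samples
print(modulo_encode(x)[1])                 # -> [0, 1, 2, 8, 32]: winding grows linearly with intensity
print(mantissa_encode(x)[1])               # -> [-1, 0, 1, 3, 5]: winding grows only logarithmically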

 

MantissaCam - wrapping
Example showing an HDR Gaussian function wrapped into an LDR image using the modulo and mantissa encodings. For this example, the modulo encoding requires more wraps than the mantissa encoding, which makes its reconstruction via computational unwrapping more challenging.
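A small Python sketch echoing this figure, under the same assumed encoding forms as above: it counts wrap events along a 1D Gaussian profile by detecting large jumps in the encoded signal (the peak height and sampling density are arbitrary choices for illustration).

import numpy as np

t = np.linspace(-3.0, 3.0, 200_000)
gaussian = 20.0 * np.exp(-0.5 * t**2)             # HDR profile, peak 20x the unit sensor range

modulo = np.mod(gaussian, 1.0)                    # wraps at every unit of irradiance
exponent = np.floor(np.log2(gaussian))
mantissa = gaussian / 2.0**exponent - 1.0         # wraps at every power of two

def count_wraps(y):
    """Count large jumps, i.e. wrap events, in an encoded 1D signal."""
    return int(np.sum(np.abs(np.diff(y)) > 0.5))

print(count_wraps(modulo), count_wraps(mantissa))  # prints 38 14: far fewer wraps for the mantissa encoding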

 

MantissaCam - Pipeline
MantissaCam pipeline. An HDR scene is imaged by a camera with in-pixel processing capabilities that implements the proposed irradiance encoding scheme (left). The resulting LDR sensor image encodes lower irradiance values much like a conventional camera, but bright image regions, including the lamp and the reflections on the ground, are wrapped rather than saturated (center). The mantissa-encoded image is first processed by a network that predicts the wrap edges and then by another network that predicts the winding number (center right). The per-pixel winding numbers, together with the mantissa-encoded image, are then used to reconstruct the HDR image (right).
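The sketch below illustrates the final reconstruction step under an assumed mantissa parameterization, with tiny placeholder CNNs standing in for the edge- and winding-prediction networks (TinyUNetStub, mantissa_decode, and all hyperparameters are hypothetical stand-ins, not the authors' architecture).

import torch
import torch.nn as nn

class TinyUNetStub(nn.Module):
    """Placeholder CNN standing in for the edge / winding prediction networks."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

def mantissa_decode(encoded, exponent):
    """Recover HDR values as (1 + mantissa) * 2**exponent, assuming encoded = x / 2**e - 1 in [0, 1)."""
    return (1.0 + encoded) * torch.pow(2.0, exponent)

edge_net = TinyUNetStub(in_ch=1, out_ch=1)       # predicts wrap-edge probabilities
wind_net = TinyUNetStub(in_ch=2, out_ch=1)       # predicts the per-pixel winding number (exponent)

encoded = torch.rand(1, 1, 64, 64)               # dummy mantissa-encoded LDR image
edges = torch.sigmoid(edge_net(encoded))
exponent = torch.round(wind_net(torch.cat([encoded, edges], dim=1)))
hdr = mantissa_decode(encoded, exponent)         # reconstructed HDR image
print(hdr.shape)                                  # torch.Size([1, 1, 64, 64])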

 

MantissaCam - Results Sim
Evaluation of encoding and decoding schemes in simulation. A conventional modulo encoding wraps the irradiance of a scene into an LDR sensor image (column 1). A graph-cuts-based reconstruction algorithm [66] usually performs poorly (column 2), whereas the recently proposed UnModNet architecture [67] often estimates reasonable HDR images (column 3). Yet, the proposed reconstruction framework works best among these methods (column 4). Moreover, the proposed mantissa encoding scheme (column 5) induces fewer irradiance wraps, making it easier to reconstruct the HDR image using our framework (column 6). Our approach achieves reconstructions closest to the ground truth (column 7). 'P', 'S', and 'Q' indicate the PSNR, SSIM, and Q-score for each reconstruction method.
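For reference, the 'P' and 'S' numbers can be computed as follows (illustrative sketch with random stand-in data, not the authors' evaluation code; the Q-score, an HDR-specific quality metric, is not reproduced here).

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

gt = np.random.rand(256, 256).astype(np.float32)                                # stand-in ground-truth image (normalized)
rec = np.clip(gt + 0.01 * np.random.randn(256, 256), 0, 1).astype(np.float32)   # stand-in reconstruction

p = peak_signal_noise_ratio(gt, rec, data_range=1.0)
s = structural_similarity(gt, rec, data_range=1.0)
print(f"P = {p:.2f} dB, S = {s:.3f}")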

 

MantissaCam - Quantitative Results
Quantitative evaluation of modulo and mantissa in-pixel encoding combined with various reconstruction algorithms. Our irradiance unwrapping network performs better than existing algorithms on the modulo encoding, as evaluated by several metrics. Combined with the proposed mantissa encoding, our approach achieves state-of-the-art results. We also show the quality of a CNN working with conventional LDR images using the same dataset. Values marked with * are reproduced from [67].

 

MantissaCam - Prototype
Prototype camera capturing the still life HDR scene.

 

MantissaCam - Captured Results
Experimental results. Using a programmable sensor, SCAMP-5, we obtain a ground truth HDR image of this scene from bracketed exposures (left and center). We prototype the proposed mantissa irradiance encoding scheme that wraps the irradiance into the LDR sensor image (center right) and reconstruct the HDR image (right) with the proposed algorithm.
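A minimal sketch of how a ground-truth HDR image can be merged from bracketed exposures, assuming linear captures and a standard hat-shaped weighting; the exact merging procedure used with the SCAMP-5 prototype may differ.

import numpy as np

def merge_exposures(ldr_images, exposure_times):
    """Weighted-average merge of linear LDR captures into an HDR radiance estimate."""
    num = np.zeros_like(ldr_images[0], dtype=np.float64)
    den = np.zeros_like(ldr_images[0], dtype=np.float64)
    for img, t in zip(ldr_images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)   # hat weighting: distrust near-dark and near-saturated pixels
        num += w * img / t
        den += w
    return num / np.maximum(den, 1e-8)

times = [1 / 500, 1 / 60, 1 / 8]                           # bracketed exposure times (seconds)
scene = 50.0 * np.random.rand(32, 32)                      # latent HDR radiance (arbitrary units)
captures = [np.clip(scene * t, 0.0, 1.0) for t in times]   # simulated linear, clipped LDR captures
hdr = merge_exposures(captures, times)
print(float(np.abs(hdr - scene).max()))                    # close to 0 in this noise-free simulation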

Related Projects

You may also be interested in related projects focusing on programmable sensors and snapshot HDR imaging:

  • Martel et al., Neural Sensors. IEEE Trans. PAMI / ICCP 2020 (link)
  • Vargas et al., Time-Multiplexed Coded Apertures. ICCV 2021 (link)
  • Wetzstein et al., Inference in artificial intelligence with deep optics and photonics. Nature 2020 (link)