Adaptive Incident Radiance Field Sampling and Reconstruction Using Deep Reinforcement Learning

Contribution

  • Addresses the light-field sampling and reconstruction problem with deep learning techniques and offline datasets.
  • Proposes a novel R-network that explores the image and direction spaces of the radiance field to effectively filter and reconstruct the incident radiance field.
  • Presents a novel RL-based Q-network to guide the adaptive rendering process.

Related Work

  • Image-Space Methods
  • Light Field Reconstruction Methods
  • Light Field Adaptive Sampling Methods
  • Filtering Using DNNs
  • DRL

Representation of Incident Radiance Field

  • 4D incident radiance field
    • Image spaces: the space of a pixel or a shading point
    • Direction spaces: the space of an incident hemisphere centered on the average normal of a group of shading points
  • Radiance field blocks
    • Partition the image space (pixels) into tiles
    • Partition the direction space into bins
  • A radiance block $B^j$ is defined by a bounded domain of the direction space and can be expressed as:
    $$
    B^j\triangleq\{(\theta,\phi):0\leq\theta_0^j\leq\theta\leq\theta_1^j<\tfrac{\pi}{2},\;0\leq\phi_0^j\leq\phi\leq\phi_1^j<2\pi\}
    $$
    • $[\theta_0^j,\theta_1^j]$ and $[\phi_0^j,\phi_1^j]$ are the elevation and azimuth angle bounds of $B^j$
    • The j-th partition in the direction space at the particular pixel $\pmb x$ is $B_{\pmb x}^j$
    • The j-th partition in the direction space shared by all pixels in the tile is $B_T^j$
  • Guided by the Q-network
    • Adaptively partition the direction space into a radiance field hierarchy with nodes of various sizes
    • Recursively halve the azimuth angle $\phi$ and the cosine-weighted elevation angle $\theta$
  • Reconstruct an incident radiance field per tile
    • As the inputs of the networks
    • The hierarchy is built in the hemisphere of the local frame of a tile
      • Defined from the average normal of pixels in the tile
  • Project the result to the individual frames of the pixels
    • Special case: the hemisphere of an individual pixel may contain incident directions (i.e., uncovered domains) that are not covered by the hemisphere of the average local frame
      • These uncovered domains lead to poor runtime performance
      • Assign a uniform PDF to each uncovered domain for unbiased sampling
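The recursive partition described above can be sketched in a few lines. This is a minimal illustration, assuming each split halves the azimuth range linearly and halves the elevation range in the cosine-weighted measure (equal projected solid angle); the `Block` record and `split` helper are illustrative names, not the paper's code:

```python
import math
from dataclasses import dataclass

@dataclass
class Block:
    """Bounds of a radiance field block B^j in the direction space."""
    theta0: float  # elevation lower bound
    theta1: float  # elevation upper bound
    phi0: float    # azimuth lower bound
    phi1: float    # azimuth upper bound

def split(block):
    """Split a block into 2x2 children: the azimuth range is halved
    linearly, while the elevation range is halved in the cosine-weighted
    measure, so both children carry equal projected solid angle."""
    phi_mid = 0.5 * (block.phi0 + block.phi1)
    # The cosine-weighted measure over [t0, t1] is (sin^2 t1 - sin^2 t0) / 2,
    # so the midpoint equalizes sin^2(theta).
    s0, s1 = math.sin(block.theta0) ** 2, math.sin(block.theta1) ** 2
    theta_mid = math.asin(math.sqrt(0.5 * (s0 + s1)))
    return [Block(t0, t1, p0, p1)
            for (t0, t1) in ((block.theta0, theta_mid), (theta_mid, block.theta1))
            for (p0, p1) in ((block.phi0, phi_mid), (phi_mid, block.phi1))]
```

Splitting the full hemisphere this way puts the elevation midpoint at $\pi/4$, and the four children each cover one quarter of the cosine-weighted hemisphere.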

Radiance Field Reconstruction Using the R-network

  • Radiance field reconstruction $\mathcal N$ of an incident radiance block $B^j$ is

    $$
    \hat L_{in}^{B^j}(\pmb x)=\mathcal N(\pmb X,\Xi;\pmb w)
    $$

    • $\hat L_{in}^{B^j}(\pmb x)$: the output of the network
      • The average incident radiance in $B^j$ at pixel $\pmb x$
    • $\pmb X$: the incident radiance samples in the domain of $B^j$
    • $\Xi$: the auxiliary features
      • e.g., position, normal, and depth
    • $\pmb w$: the trainable weights and bias terms of $\mathcal N$

Filtering 4D Radiance Space

  • Challenges
    • The number of samples per radiance block is smaller than the number of samples per pixel, since each pixel contains multiple blocks
    • Due to the curse of dimensionality, performing convolutions in a 4D space requires more memory, training time, and data
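The dimensionality overhead is easy to quantify: a dense convolution kernel of width $k$ has $k^2$ spatial taps in 2D but $k^4$ in 4D. A minimal sketch (the kernel width of 5 is an arbitrary illustrative choice, not a value from the paper):

```python
def conv_param_count(kernel_width, ndim, in_channels, out_channels):
    """Parameter count of a dense convolution kernel in `ndim` dimensions."""
    return kernel_width ** ndim * in_channels * out_channels

# A 5-wide kernel has 5^2 = 25 spatial taps in 2D but 5^4 = 625 in 4D,
# so a direct 4D light-field convolution pays a 25x overhead per layer.
ratio = conv_param_count(5, 4, 1, 1) // conv_param_count(5, 2, 1, 1)
```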

R-Network

  • Differences compared to image-space filtering

    • Samples are dispersed in many directions
      • Inputs are sparse at individual radiance field blocks
    • Direct convolution in the 4D light-field space requires high memory and computation
  • Four different CNNs

    • Image network
      • Use image-space auxiliary features and performs image-space convolution
    • Direction network
      • Works with features and convolution in the direction space
    • Image-direction network (final R-network)
    • Direction-image network

Image-Direction Network

  • The image-direction network can be partitioned into image and direction parts

    • $\pmb X^i_\Gamma$: feature map associated with directional block $i$ and pixel tile $\Gamma$
    • The image part takes some image-space auxiliary feature maps $\pmb G_\Gamma$ (i.e., surface normals, positions, and depth) and radiance feature maps $\pmb R^j_{\Gamma}$ (mean, variance and gradient of the radiance) as inputs
    • The output is the direction-space feature map $\pmb F_{\pmb d_\Gamma}^j$
      • The direction part takes the image part's output as input and simultaneously convolves the radiance predictions of all radiance field blocks
  • The geometrical features have a total of 26 channels as follows:

    • Three channels for the average normal, one channel for the average variance in the normals, and six channels for the gradients of the average normal
    • Three channels for the average position, one channel for the average variance in the positions, and six channels for the gradients of the average position
    • One channel for the average depth, one channel for the variance in the depth, and two channels for the gradients of the average depth
    • Two channels for the gradients of the average radiance of all blocks
  • The radiance features have a total of four channels, which comprise:

    • Three channels for radiance
    • One channel for average variance in the radiance
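The channel bookkeeping above can be sanity-checked with a short sketch. The channel names are paraphrased from the lists; the grouping into dictionaries is purely illustrative:

```python
# Channel counts for the image-space auxiliary (geometric) features G_Gamma.
GEOMETRY_CHANNELS = {
    "avg_normal": 3, "normal_variance": 1, "avg_normal_gradients": 6,
    "avg_position": 3, "position_variance": 1, "avg_position_gradients": 6,
    "avg_depth": 1, "depth_variance": 1, "avg_depth_gradients": 2,
    "avg_radiance_gradients": 2,
}
# Channel counts for the per-block radiance features R_Gamma.
RADIANCE_CHANNELS = {"radiance": 3, "radiance_variance": 1}

def channel_count(spec):
    """Total number of channels in a feature-map specification."""
    return sum(spec.values())
```

The geometric features sum to 26 channels and the radiance features to 4, matching the counts stated above.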

Experiments on Reconstruction Networks

DRL-based Adaptive Sampling

  • Sample distribution and radiance field resolution greatly influence the results

    • More samples provide richer information, even when well-trained denoising CNNs are used
    • Adaptively refining the radiance field is a commonly used strategy to preserve lighting details with a limited budget
  • Propose the use of the DRL-based Q-network to guide the sampling and refinement of the radiance field hierarchy

    • Why DRL is used to train the network: attempting to cover all possible radiance field hierarchies and sample distributions in a search for the ground truth is impractical
    • Treat adaptive sampling as a dynamic process that iteratively takes actions to refine radiance field blocks into smaller blocks or to increase the number of samples
    • The trained Q-network evaluates the value of each action at runtime to guide the adaptive process
  • Two factors are critical when building the hierarchy:

    1. The structure of the hierarchy, i.e., the method for discretizing the radiance field
      • Noted in a previous adaptive method: a higher grid resolution can effectively capture high-frequency lighting features, but it comes with an overhead
    2. An adaptive sample distribution
      • More samples (i.e., a greater sample density) are placed in those noisy areas (blocks or nodes) to reduce reconstruction errors

Deep Q-Learning

  • Input states: the global radiance field information (e.g., geometry information, radiance samples and radiance field hierarchy)
  • Output: the quality value (Q-value) of each possible action, used to determine the next action
  • Action:
    1. Resample the block by doubling its sample density (the number of samples per block)
      • Decrease the variance in the radiance feature, which suppresses noise
    2. Refine the radiance field block to 4x4 new blocks by equally partitioning each axis, while keeping the average number of samples per block by adding some new samples
      • The goal of maintaining the sample density per block is to prevent the degeneration of the reconstruction quality due to the sparser samples
      • Increase the resolution of the grid, which can capture high-frequency details
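The two actions can be sketched on a minimal stand-in for a block's sampling state. The `BlockState` record and the 16-way refinement (4x4 children, per the list above) are illustrative, not the paper's implementation:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class BlockState:
    """Minimal stand-in for a radiance field block's sampling state."""
    num_samples: int  # samples currently allocated to this block

def resample(block):
    """Action 1: double the block's sample density (samples per block)."""
    return [replace(block, num_samples=block.num_samples * 2)]

def refine(block):
    """Action 2: split into 4x4 children, adding new samples so each child
    keeps the parent's per-block sample count (density is preserved)."""
    return [BlockState(num_samples=block.num_samples) for _ in range(16)]
```

Resampling suppresses noise by lowering variance; refinement raises the grid resolution without letting any child block become sparser than its parent.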
  • The quality value $Q$ and the reward $r$ of an action are defined per radiance field block. For radiance field block $B^j$ at a pixel, the quality value of taking action $a$ in state $s^j$ is defined by the Bellman equation:
    $$
    Q^j(s^j,a)=r(s^j,a)+\gamma\max_{a'}Q^j(s'^j,a')
    $$
    • $s^j$ and $s'^j$ correspond to the states before and after action $a$ is taken
    • $a'$ is a possible next action
    • $r(s^j,a)$ denotes the reward of the action
    • $\gamma$ is a decay parameter between 0 and 1
  • Estimate the Q-value of an action:
    • Approximate equation as follows:
      $$
      Q^j(s^j,a)\approx r(s^j,a)+\gamma\max_{a'}r(s'^j,a')
      $$
      Define the reward $r(s^j, a)$ as follows:
      $$
      r(s^j,a)=E^j(s^j)-E^j(s'^j)
      $$
      • $E^j(s^j)$: the reconstruction error of block $B^j$
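Under the one-step approximation above, the Q-value estimate reduces to error reductions. A small sketch, where the error values stand in for the block reconstruction error $E^j$ and the decay value is an arbitrary illustrative choice:

```python
GAMMA = 0.9  # decay parameter in (0, 1); this value is illustrative

def reward(error_before, error_after):
    """r(s, a) = E(s) - E(s'): the reduction in block reconstruction error."""
    return error_before - error_after

def approx_q(error_s, error_s_next, errors_after_next_actions, gamma=GAMMA):
    """One-step approximation Q(s, a) ~ r(s, a) + gamma * max_a' r(s', a'),
    where errors_after_next_actions lists the block error after each
    candidate follow-up action a'."""
    r_now = reward(error_s, error_s_next)
    r_best_next = max(reward(error_s_next, e) for e in errors_after_next_actions)
    return r_now + gamma * r_best_next
```

An action is thus valued by how much it shrinks the block's error now, plus a discounted bonus for the best error reduction it enables next.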

Reinforcement Learning Process

Q-network Structure

Adaptive Sampling and Rendering

The adaptive sampling and the rendering pipeline contain three steps given the trained networks:

  1. Use the trained Q-network to guide the adaptive sampling and refinement of the radiance field blocks
    • Result in a hierarchy of radiance field blocks
  2. Use the trained R-network to reconstruct the incoming radiances from the hierarchy
  3. Apply the reconstruction result for the final rendering
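The three steps can be summarized in a short driver loop. Every callable here (`q_network`, `r_network`, `render`) is a placeholder for the corresponding trained component, and the greedy argmax over (block, action) pairs is one plausible way to "use the Q-network to guide" step 1, not the paper's exact scheduler:

```python
def adaptive_render(q_network, r_network, render, hierarchy, budget):
    """Driver for the three-step pipeline described above."""
    # Step 1: Q-network-guided adaptive sampling and refinement.
    for _ in range(budget):
        block, action = max(
            ((b, a) for b in hierarchy.blocks for a in ("resample", "refine")),
            key=lambda ba: q_network(hierarchy, ba[0], ba[1]))
        hierarchy = hierarchy.apply(block, action)
    # Step 2: reconstruct the incident radiances from the hierarchy.
    radiance_field = r_network(hierarchy)
    # Step 3: final rendering from the reconstruction.
    return render(radiance_field)
```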

Adaptive Sampling Algorithm

Reconstruction and Final Rendering

  • Use the image-direction R-network to reconstruct the radiance field blocks of the hierarchy $H_T$
    • Generate a fast preview: simply use the reconstructed incident radiance field to evaluate and integrate the product of the incident radiance and the BRDF
    • Render the unbiased image: treat the reconstructed radiance field and the BRDF as two PDFs to generate the sampling directions, combine those two samplers via MIS
      • The BRDF samples can be analytically drawn from a cumulative distribution function
      • The reconstructed radiance field samples are generated by first selecting a radiance field block from a discrete PDF and then sampling a direction within the block proportionally to the cosine weighting term
  • As multi-bounce vertices are sparse, they are not compatible with the input format of the networks
    • Switch to standard multiple importance sampling
    • Other photon-guided methods can be adopted if the lighting is complex
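Combining the two samplers via MIS can be sketched with a one-sample balance heuristic. This is the generic unbiased MIS formulation under standard assumptions (equal selection probability for the two samplers), not necessarily the paper's exact weighting:

```python
import random

def mis_one_sample(sample_brdf, pdf_brdf, sample_field, pdf_field, integrand,
                   rng=random.random):
    """One-sample MIS with the balance heuristic: pick one of the two
    samplers with probability 1/2, then divide by the mixture PDF so the
    combined estimator remains unbiased."""
    direction = sample_brdf() if rng() < 0.5 else sample_field()
    mixture_pdf = 0.5 * pdf_brdf(direction) + 0.5 * pdf_field(direction)
    return integrand(direction) / mixture_pdf
```

Whichever sampler produces the direction, dividing by the mixture PDF keeps the estimator unbiased; directions that both the BRDF and the reconstructed radiance field deem likely receive proportionally lower weight.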

Result

Limitations

  • BRDF term is not considered
  • Focus on first-bounce radiance field reconstruction