# Vectorization for Fast, Analytic, and Differentiable Visibility

**Author**: Yang Zhou, Lifan Wu, Ravi Ramamoorthi, Ling-Qi Yan

**Paper**: Link

**Video**:

**Presentation:**

## Introduction

There are generally two approaches to render images:

- Rasterization: Project scene geometry onto the screen and breaks the geometry into pixels
- Fast but prone to aliasing

- Ray Tracing: Cast rays into the scene and bounces them stochastically to find paths connecting the light and the camera
- High-quality but is slow and noisy

### Main Problem

Main problem for both approaches: point sampling

- The central point of each pixel is typically used to detect the coverage of geometry in rasterization, producing aliasing.
- The sampled paths in ray tracing are essentially points in the high dimensional pathspace, and such random point sampling of the path space introduces variance.
- Since the point samples are discrete, it is difficult to calculate the gradients of the rendering process with respect to scene parameters in order to enable differentiable rendering, especially with discontinuities that are commonly seen, such asboundaries of geometry and shadows.

### Main Contribution

- A novel vectorization pipeline different from both ray tracing and rasterization, eliminating point sampling in the 2D integrationdomain, producing fully analytic, accurate and anti-aliased point-to-region shading results.
- A hierarchical data structure for efficient, robust, depth-aware andorder-independent insertion and maintenance of polygons, and most important.
- A differentiable rendering framework for visibility using only automatic differentiation or finite differences without special treat-ments to the rendering pipeline, resulting in noise-free gradients, which enables second-order optimization methods in inverse rendering for the first time, and orders of magnitude faster performance than existing approaches.

## Related Work

**Beam Tracing**

**Visibility**

**Hidden Surface Removal**

**Differentiable Light Transport**

**Differentiable Rasterization**

**Automatic Differentiation**

**Vector Representation in Images**

## Overview

### The Point-to-Region Light Transport Problem

Many problems in light transport involve solving high dimensional integrals. For the direct illumination from area light sources, which can be formulated as a 4D region-to-region light transport integral:

$$

I=\int_{\mathcal P}W(x)\underbrace{\int_AL_e(\pmb y\rightarrow x)f_r(\pmb y\leftrightarrow x\leftrightarrow x_c)G(x\leftrightarrow \pmb y)V(x,\pmb y)\mathrm d\pmb y\mathrm dx}_{=:L(\pmb x)}

$$

- $\mathcal P$ : Pixel coverage (pixel footprint)
- $A$: the set of area lights
- $W(\cdot)$: The pixel reconstruction filter
- $L(x)$: The radiance received at a primary shading point $x$
- Integrate over contributions from area light sources
- $\pmb y$: A point on the area lights
- $L_e$: The emitted radiance
- $f_r$: The BRDF
- $x_c$: The camera position
- $G$: The geometry term
- $V$: The binary visibility function

Finding an analytic solution to the full 4D region-to-region integral is challenging. This paper presents a geometry vectorization method that can compute its 2D point-to-region sub-problem. The method is based on beam tracing, which traces beams from a shading point $x$ to area lights.

Reformulate the received radiance $L(x)$ as

$$

\begin{aligned}

L(x)&=\int_{\mathcal H^2}f_r(x,\omega_i,\omega_o)\left<\pmb n(x),\omega_i\right>V(x,\omega_i)\mathrm d\omega_i\\

&=\int_{Q(x)}L_i(x,\omega_i)f_r(x,\omega_i,\omega_o)\left<\pmb n(x),\omega_i\right>\mathrm d\omega_i

\end{aligned}

$$

- $\mathcal H^2$: The hemisphere domain
- $\omega_i=(y-x)/\|y-x\|$: The incoming light direction
- $L_i(x,\omega_i)=L_e(y\rightarrow x)$: The incoming radiance
- $\omega_o=(x_c-x)/\|x_c-x\|$: The view direction
- $\left<\cdot,\cdot\right>$: The clamped dot product
- $\pmb n(x)$: The shading normal
- $V(x,\omega_i)$: Indicate whether a shadow ray that starts at $x$ towards $\omega_i$ is occluded or not

Moving the visibility function from the integrand to the integration domain by defining $Q(x)=\{\omega|V(x,\omega)=1\}$ as a set of spherical polygons, which are formed by projecting all the visible regions of area lights onto the unit sphere centered at the shading point $x$

Key: computing all the visible regions of area lights $Q(x)$ within the beams using a binary Visibility Bounding Volume Hierarchy

### Analysis

#### Practicality

In practice, most of the 4D region-to-region light transport problem can be factored out into several point-to-region subproblems.

Use a hybrid approach to estimate the 4D region-to-region integral:

- Sample primary rays within a 2D pixel footprint $\mathcal P$
- Use adaptive sampling similar to MSAA
- Group similar regions inside each pixel into at most 4 groups, perform beam tracing once for each group

- Trace beams to compute the other 2D point-to-region integral

#### Difficulty

- Predicate: detect relationships between geometric primitives, including problems like detecting whether a point is on a line segment
- When implementing themwith floating-point numbers, the numerical inaccuracies are usually ignored or walk-arounded by 𝜖-tweaking in engineering

- Construction: such as the hidden surface removal problem, generate new geometric primitives based on predicates
- Require much stronger guarantee on numerical robustness because otherwise, errors will accumulate and will lead to invalid control flow

## Geometry Vectorization

This paper focus on evaluating 2D point-to-region integrals in the following form:

$$

I=L(x)=\int_{Q(x)}h(\omega)\mathrm d\omega

$$

- The integration domain $Q(x)=\{\omega|V(x,\omega)=1\}$ is the visible polygons projected to the unit sphere centered at $x$
- $h(\omega)$ represents a general spherical function whose explicit formulation depends on specific rendering application

Many problems in rendering, e.g. shading with area lights, can be formulated in this form of integral

- Usual approach to solve this integral is Monte Carlo integration
- For some specific spherical functions used inrendering such as Linearly Transformed Cosines(LTC), integrating them over spherical polygons yields analytic solutions, which is beneficial for noise-free anti-aliased rendering
- Therefore, the integral evaluation boils down to computing the visible spherical polygons $Q(x)$

### Computing Visible Polygons

Input: a set of 3D triangles $\{\Delta_i^{3D}\}_{i=1}^n$, a beam frustum

- Assume that every pair of triangles are either disjoint or sharing vertices and edges, i.e., there is no pentration
- The beam frustum is defined by a 4x4 perspective projection matrix

Therefore, we can transform all the 3D triangles according to the perspective projection matrix. After the projection, all the relevant triangles are in the normalized device coordinate (NDC) space, i.e., the $x$, $y$ and $z$ coordinates of triangle vertices are within $[-1,1]$. The occlusion between the original 3D triangles can be equivalently resolved using a set of 2D projected triangles $\{\Delta_i\}^n_{i=1}$ and their depths $\{z_i\}_{i=1}^n$

This paper use an interative approach to build a data structure $D=\{p_k\}^m_{k=1}$ recording all the visible polygons in 2D, as we add the triangles $\{\Delta_i\}_{i=1}^n$ one by one.

- Require all of the polygons $p_k$ to be convex
- Denote $D_i$ as the intermediate result after adding $\Delta_1,\Delta_2,\cdots,\Delta_i$
- Initially, $D_0$ only has the square $[-1,1]^2$, indicating the projection of the frustum viewport rectangle at the far plane
- The output is $D_n$, indicating all the 2D polygons with the minimum depths.
- They can be mapped to 3D visible polygons within the beam frustum by the inverse perspective projection transformation, and further projected on the unit hemisphere to get the integration domain $Q(x)$

Algorithm:

Details at the $i$-th iteration that updates $D_i$ by adding $\Delta_i$

**Intersecting**

- Given a new triangle $\Delta_i$, find the convex polygons $p_k\in D_{i-1}$ that overlap with $\Delta_i$, i.e., $\tilde p_k=\Delta_i\cap p_k\neq\phi$
- Build a 2D Visibility Bounding Volume Hierarchy (VBVH) to speed up the intersection test and maintain this datastructure as we add new triangles.
- Once find the convex polygons $p_k$ that overlap with $\Delta_i$, we can compute their overlapping regions $\tilde p_k$.
- Calculate the residual region $r_k=p_k\backslash \tilde p_k$ by 2D polygon differencing.
- The running time is linear in the number of polygon vertices, indicating that it belongs to the triangle $\Delta_{c_k}$

**Labeling**

- Each visible convex polygon $p_k$ in $D_{i-1}$ is associated with a label $c_k$
- For every overlapping region $\tilde p_k$, we compare the depth values of $\Delta_i$ and $p_k$ (restricting on $\tilde p_k$)
- Given the no-penetrating assumption, the new triangle $\Delta_i$ is either behind or in front of $p_k$

- Sometimes it is unnecessary to use a different label for every individual triangle, e.g., shading from area lights, the polygons representing occluders can share the same label

**Splitting**

- For every convex polygon $p_k$ in $D_{i-1}$, it can be decomposed into two disjoint regions $\tilde p_k$ and $r_k=p_k\backslash \tilde p_k$
- If the two regions have the same label, no changes have to made. Otherwise, need to remove the original polygon $p_k$ and add these two new regions $\tilde p_k$ and $r_k$ to the data structure
- $\tilde p_k$ is a convex polygon since it is the intersection of a convex polygon and a triangle
- $r_k$ may be non-convex, need to split it into multiple disjoint convex regions $r_k\bigcup_jr_{k,j}$

- The list of convex regions is updated as

$$

D_i=D_{i-1}-\{p_k\}+\{\tilde p_k\}+\{r_{k,j}\}

$$**Merging**

- If the new triangle $\Delta_i$ is completely visible, i.e., all the sub-regions $\tilde p_k$ (note that $\Delta_i=\bigcup_k\tilde p_k$) are visible and share the same label, we can just add $\Delta_i$ to $D_i$ by merging all of $\tilde p_k$

### Visibility Bounding Volume Hierarchy

- Leaf node: contain a visible convex polygon and its corresponding bounding box
- Interior node: store the union of the bounding boxes of its children

Support operations:

**Intersecting**

**Splitting**

**Merging**

## Application

### Anti-Aliased Primary Visibility

Primary visibility is used to determine the occupied area of visiblepolygons inside a pixel. For traditional rasterization or ray tracing methods, it is common to get aliased results due to the insufficient sampling rates, especially at the object’s boundary regions that have many sub-pixel geometric details.

For each pixel on the image plane, we create a beam that originates from the camera and goes exactly through the pixel. Assuming the pixel is a square of $[-1, 1]^2$, the pixel color is

$$

I=\int_{[-1,1]^2}\alpha(\pmb u)c(\pmb u)\mathrm d\pmb u

$$

- $\alpha(\pmb u)$: the binary function indicating whether a position $\pmb u$ inside the pixel is covered by the projected triangles
- $c(\pmb u)$ is the color of the triangle

Let $\{p_k\}^m_{i=1}$ be the set of 2D regions generated by previous visible polygon computation, and $\{c_k\}_{i=1}^m$ be the corresponding colors. The pixel color can be reformulated as a summation

$$

I=\sum_{k=1}^mc_k\underbrace{\int_{p_k}\mathrm d\pmb u}_{=:\mathrm{Area}(p_k)}

$$

- The region areas $\mathrm{Area}(p_k)$ can be evaluated analytically given the polygon vertices
- Able to compute the exact pixel colors and produce anti-aliased images

### Accurate Shading from Area Lights

Assume that the area light emits constant radiance $L_e$. The received radiance at a shading point $\pmb x$ from area lights is

$$

I=L(\pmb x)=L_e\int_{Q(\pmb x)}f_r(\pmb \omega_i,\pmb\omega_o)\left<\pmb n,\pmb\omega_i\right>\mathrm d\pmb\omega_i

$$

To compute $Q(\pmb x)$, we create a beam from the shading point $\pmb x$ to the area light sources.

Use previous visible polygon computation to find the visible regions representing the area lights. Then project them onto the unit sphere centered at $\pmb x$ by the inverse perspective projection, resulting in a set of spherical polygons $\{Q_k\}_{k=1}^m$. Then we can rewrite the shading integral as

$$

I=\sum_{k=1}^mL_e\underbrace{\int_{Q_k}f_r(\pmb\omega_i,\pmb\omega_o)\left<\pmb n,\pmb\omega_i\right>\mathrm d\pmb\omega_i}_{=:I_k}

$$

The integrals over spherical polygons $I_k$ can be approximated analytically using Linearly Transformed Cosines (LTCs). Fitting the cosine-weighted BRDF (the integrand of $I_k$) by an LTC with the transformation matrix $M$. The integral $I_k$ has an analytic solution:

$$

I_k=E(\tilde Q_k)=\frac{1}{2\pi}\sum_{j=1}^N\arccos(\left<\pmb q_j,\pmb q_{j+1}\right>)\left<\frac{\pmb q_j\times\pmb q_{j+1}}{\|\pmb q_j\times\pmb q_{j+1}\|},\hat{\pmb z}\right>

$$

- $E$: the irradiance of the polygon $\tilde Q_k=M^{-1}Q_k$
- $\pmb q_j$: the $j$-th vertex of the polygon $\tilde Q_k$, define $\pmb q_{N+1}=\pmb q_1$
- $\hat{\pmb z}=(0,0,1)$

### Anti-Aliased Shadows from Point Lights

The shading of a polygonal surface patch that is seen through a pixel and is lit by a point light can be formulated as a 2D integral:

$$

I=\int_{\mathcal P}W(\pmb x)f_r(\pmb \omega_i,\pmb\omega_o)L_i(\pmb x,\pmb\omega_i)\left<\pmb n(\pmb x),\pmb\omega_i\right>V(\pmb x,\pmb\omega_i)\mathrm d\pmb x

$$

- $\pmb \omega_i=(\pmb x_l-\pmb x)/\|\pmb x_l-\pmb x\|$: light direction
- $\pmb x_l$: point light position
- $\pmb x$: shading position

Assuming the footprint is small enough compared to the size of the scene, we can approximate by splitting the integral into two parts:

$$

I\approx\int_{\mathcal P}W(\pmb x)f_r(\pmb \omega_i,\pmb\omega_o)L_i(\pmb x,\pmb\omega_i)\left<\pmb n(\pmb x),\pmb\omega_i\right>\mathrm d\pmb x \cdot\int_{\mathcal P}V(\pmb x,\pmb \omega_i)\mathrm d\pmb x

$$

- The first integral $I_{\mathrm{unshadow}}$ represents the unshadowed contribution, where the integrand terms vary smoothly
- It usually can be evaluated easily by taking one or a few samples on the pixel footprint

- To compute the visibility integral in the second part $I_{\mathrm{vis}}$, we trace a narrow beam from the light that encloses the patch, and compute the regions that are visible to the point light $Q(\mathcal P)=\{\pmb x|V(\pmb x,\pmb \omega_i)=1\}$ by the paper’s vectorization method.

The second visibility integral becomes the area of the visible polygons:

$$

I_{\mathrm{vis}}=\int_{Q(\mathcal P)}\mathrm d\pmb x

$$

which has analytic solutions given polygon vertices

In practice, instead of computing the visible polygons $Q(\mathcal P)$ for each pixel footprint, usually precompute a vectorized shadow map that records all visible polygons from all directions as viewed from the light position.

### Pure Specular Reflection

### Differentiable Rendering

Vectorization analytically finds all visibility discontinuties, therefore, automatic differentiation just works.

Example: Primary visibility

$$

\begin{aligned}

I&=\int_{[-1,1]^2}\alpha(\pmb u)c(\pmb u)\mathrm d\pmb u\\

&=\sum_{k=1}^mc_k\mathrm{Area}(p_k)

\end{aligned}

$$

$$

\frac{\partial I}{\partial\theta}=\sum_{k=1}^mc_k\frac{\partial}{\partial \theta}\mathrm{Area}(p_k)

$$

## Result

## Conclusion

### Summary

- Analytic 2D point-to-region visibility
- Noise-free rendering
- Noise-free differentiable rendering

### Future Work

- Extend to higher dimensions
- 4D region-to-region visibility
- Global illumination

- More optimization
- VBVH refitting
- Exact vs. approximate