Overview

The voxel grid representation is simple and efficient, which makes it an appealing choice for 3D deep learning. For these reasons, and because voxel grids are compatible with conventional neural network techniques, they appear quite often in 3D deep learning models such as Mesh R-CNN. It is also possible to relate 2D images to 3D volumes via unprojection and a rendering technique called ray marching. In practice, we can use camera calibration and unprojection to infer properties of a 3D volume from a set of 2D images, for example, by fitting density and color to a voxel grid for 3D reconstruction.
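To make unprojection concrete, here is a minimal sketch that maps a pixel and a depth value to a 3D point in camera coordinates and back. It assumes an ideal pinhole camera with no lens distortion; the function names and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative and do not come from any particular library.

```python
def unproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) at a given depth to a 3D point in camera coordinates.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all measured in pixels.
    """
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def project(point, fx, fy, cx, cy):
    """Map a camera-space point with z > 0 back to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical intrinsics for a 640x480 image.
intrinsics = dict(fx=500.0, fy=500.0, cx=320.0, cy=240.0)

point = unproject(400.0, 300.0, depth=2.0, **intrinsics)
pixel = project(point, **intrinsics)
print(point, pixel)
```

Projecting the unprojected point returns the original pixel, which is the consistency that lets observations in 2D images constrain values stored in a 3D grid.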

Differentiable volume rendering

Rendering a volume requires a completely different approach from rendering meshes. Volumes are a common solution in applications where modeling only the surface of an object is insufficient to determine its appearance, such as clouds or medical images. Such objects are typically rendered by sampling the volume along rays that travel through it and accumulating the sampled values, which we can do in a fully differentiable manner.
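As a rough illustration of accumulating samples along a ray, the sketch below uses the common emission-absorption model. This model is an assumption here, since the text does not name one, and `composite_ray` is a hypothetical helper. Each sample contributes its color weighted by its own opacity and by the fraction of light that survives the samples in front of it.

```python
import math


def composite_ray(densities, colors, step):
    """Accumulate color along one ray with emission-absorption compositing.

    densities: non-negative density at each sample, ordered near to far.
    colors:    (r, g, b) tuple at each sample.
    step:      distance between consecutive samples (assumed uniform).
    Returns the accumulated (r, g, b) color and the total opacity.
    """
    transmittance = 1.0  # fraction of light that still reaches the camera
    pixel = [0.0, 0.0, 0.0]
    for sigma, color in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)  # opacity of this segment
        weight = transmittance * alpha
        for c in range(3):
            pixel[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return tuple(pixel), 1.0 - transmittance


# A thin red sample in front of a dense green one.
color, opacity = composite_ray(
    densities=[0.5, 10.0],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    step=0.2,
)
print(color, opacity)
```

Every step is a smooth function of the densities and colors, so the same computation written with an automatic differentiation library yields gradients with respect to the volume.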

Ray marching

When we lack explicit geometry, rasterization and ray tracing cannot be applied to solve the visibility problem. Volumes are a case of implicit geometry, so we rely on a different technique called ray marching.

The ray marching technique casts rays through a scene and collects samples at points along each ray. For each pixel, a single ray that starts at the camera origin $\mathbf{o}$ and passes through that pixel is cast through the scene, with $r(t)$ denoting a point along the ray. This ray is typically expressed as:
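A common parameterization, assumed here, is $r(t) = \mathbf{o} + t\,\mathbf{d}$, where $\mathbf{d}$ is the unit direction from the camera origin through the pixel and $t \ge 0$ is the distance traveled along the ray. The sketch below builds such a ray and collects evenly spaced sample points along it. The helper names and the near and far bounds are hypothetical.

```python
import math


def ray_through(origin, pixel_position):
    """Unit direction d of the ray from the origin through a pixel's 3D position."""
    d = [p - o for p, o in zip(pixel_position, origin)]
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)


def ray_point(origin, direction, t):
    """Evaluate r(t) = o + t * d."""
    return tuple(o + t * d for o, d in zip(origin, direction))


def sample_along_ray(origin, direction, t_near, t_far, num_samples):
    """Return num_samples evenly spaced points on the ray between t_near and t_far."""
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2")
    step = (t_far - t_near) / (num_samples - 1)
    return [ray_point(origin, direction, t_near + i * step) for i in range(num_samples)]


origin = (0.0, 0.0, 0.0)
direction = ray_through(origin, pixel_position=(0.0, 0.0, 2.0))
samples = sample_along_ray(origin, direction, t_near=1.0, t_far=3.0, num_samples=5)
print(samples)
```

The volume would then be queried at each sample point, for example by interpolating the voxel grid, to obtain the values that are accumulated along the ray.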
