Homogeneous Coordinates
Learn what homogeneous coordinates are and how they enable transformations and projection.
Overview
The camera coordinate space gives us a camera-centric frame in which to describe how far visible objects are from the camera origin. It is just like any other Euclidean coordinate space. However, this coordinate space is still not ideal when we want to understand how our 3D coordinates project onto an image. Before we get into the intricacies of projection, we first need to understand homogeneous coordinates and projective geometry.
Projection and scale
Take a look at this picture of a small dog.
Notice that line A and line B both look the same length, even though we know that line A, in reality, is much longer than line B. Projection has the effect of compressing distant things and enlarging close things. We know this intuitively when we move in closer to take a photograph of a friend or back up to fit more of a room in the frame.
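To make the scale effect concrete, here is a minimal sketch that assumes an ideal pinhole camera and segments lying parallel to the image plane, so that apparent length is the true length divided by depth and scaled by the focal length. The function name, the focal length of 1.0, and the lengths and depths are illustrative assumptions; they are not measurements taken from the picture.

```python
def projected_length(true_length, depth, focal_length=1.0):
    """Apparent length on the image plane of a segment parallel to it.

    Assumes an ideal pinhole camera: image size scales with 1 / depth.
    """
    return focal_length * true_length / depth


# Hypothetical numbers: line A is ten times longer but ten times farther away.
line_a = projected_length(true_length=10.0, depth=20.0)
line_b = projected_length(true_length=1.0, depth=2.0)

print(line_a, line_b)  # both lines come out the same length in the image
```

Under these assumptions the two segments are indistinguishable in the image, which is why length alone cannot tell us how big or how far away something is.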
If you recall the pinhole camera model from earlier, notice that there are two pyramids meeting tip to tip at the aperture, one on either side. The one on the scene side is what we call the camera frustum, essentially an inverted pyramid expanding outward into our scene. This frustum defines the exact limits of what is visible in our scene.
This frustum also illustrates the projection effect: picture a single ray drawn from a pixel in our image plane through the aperture and out into the world. Notice how the ray gets farther and farther from the optical axis as it goes out into the world, and yet every point along it still projects down to the same pixel.
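The ray picture can be checked numerically. The sketch below assumes an ideal pinhole camera at the origin looking down the +z axis and, for simplicity, uses a non-inverted image plane at distance focal_length in front of the aperture. The project helper and the ray direction are hypothetical, chosen only for illustration.

```python
import math


def project(point, focal_length=1.0):
    """Pinhole projection of a camera-space point (x, y, z) with z > 0."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# An arbitrary ray leaving the aperture (assumed direction).
direction = (0.2, -0.1, 1.0)

samples = []
for t in (1.0, 5.0, 50.0):
    # Walk farther out along the same ray.
    point = tuple(t * c for c in direction)
    # Distance of this point from the optical axis (the z axis).
    offset = math.hypot(point[0], point[1])
    samples.append((point, offset, project(point)))

for point, offset, pixel in samples:
    print(point, offset, pixel)
```

The offset from the optical axis keeps growing with distance, while the projected image coordinates stay the same up to floating-point rounding.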
The homogeneous coordinates
For any 3D point to which we intend to apply projection, we can add a 4th dimension that we call the homogeneous term. Our