Frustum Culling in Flash 10

Update: corrected a few glitches in the bounding sphere creation routine.

Optimization is always important. But when it comes to 3D for the Flash Platform, it’s an everyday battle. The first ideas that come to mind are to avoid:

  1. redrawing the same regions : each pixel value must be set once and only once
  2. rendering invisible objects : objects that are out of sight still take a lot of CPU horsepower

While Flash takes care of the first one in its very renderer, the second one is not handled. But that is something we can easily address!

The Technic

The method is called “frustum culling”. The big picture is that every mesh is approximated using a bounding volume (typically a sphere or a box). If the bounding volume is visible, the corresponding mesh is rendered. The two following pictures show the frustum culling caught in action:

The mesh is visible
The mesh is visible: TPS counter indicates 18900 triangles per seconds
Frustum culling in action (TPS counter indicates 0!)
The mesh is out of sight: frustum culling is acting and TPS counter indicates 0!

1. The frustum

First things first: before performing frustum culling, you need to build the frustum! The frustum is the 3D shape that contains everything the camera can see. It’s made of 6 planes and looks like a truncated pyramid:

The view frustum
The view frustum

Each plane is defined by a plane equation: ax + by + cz + d, where (a, b, c) is the normal of the plane and d the distance on that normal. The 6 planes must be extracted from one of the Matrix3D object used to define the transform from one space to another (world to camera for example). Here are the 6 plane equations:

  • left: (a = m14 + m11, b = m24 + m21, c = m34 + m31, d = m44 + m41)
  • right: (a = m14 − m11, b = m24 − m21, c = m34 − m31, d = m44 − m41)
  • bottom: (a = m14 + m12, b = m24 + m22, c = m34 + m32, d = m44 + m42)
  • top: (a = m14 − m12, b = m24 − m22, c = m34 − m32, d = m44 − m42)
  • near: (a = m13, b = m23, c = m33, d = m43)
  • far: (a = m13, b = m23, c = m33, d = m43)

The source matrix you chose defines the coordinates systems of the extracted planes:

  • the projection matrix provides planes in view (or camera) space
  • the view * projection matrix provides planes in world space
  • the world * view * projection matrix provides planes in object (or local) space
2. Bounding volumes

Testing every polygon against the frustum is a very expensive task. In order to speed things up, we will rather test a primitive shape that bounds the mesh: a “bounding volume”. Common bounding volumes are spheres or boxes. I’ll focus on the first one in the rest of the article.

A bounding sphere is pretty simple to implement: you just need to store the center of the sphere and it’s radius. That being said, a BoundingSphere class would just contain the center as a Vector3D and the radius as a Number. When you have all the vertices of a mesh, building the bounding sphere is easy :

3. Point to plane distance

The plane equation (ax + by + cz + d) can be stored as a Vector3D object. Indeed, the Vector3D implements x, y and z but also a w Number attribute. This last one is supposed to store the homogenous coordinate. Nonetheless, we can use those 4 Number attributes to store the (a, b, c, d) values that defines our plane equation.

In order to test whether a bounding volume is inside or outside the frustum, we must be able to compute the distance of this object to the plane. To do so, we use the dot product of the point with the plane normal (which must be normalized first) :

4. The test function

Now that our frustum and our bounding volumes are ready, we can add this little test function that checks whether an object is visible or not:

The Experiment

The following experiment demonstrates frustum culling in action (click on the picture to open the application) :

Frustum culling experiment (use the arrow keys to move)

Use the arrow keys to move the robot across the scene. You can notice the TPS counter drops down to zero whenever the character is out of sight (on the near left or near right side of the screen). When the TPS indicates 0, no triangles are rendered and CPU usage drops down to 0%. It actually means that the rest of the rendering process has a minimal overhead!

And it’s even more awesome…

This experiment also introduce my brand new mini-debugger inspired of Mr. Doob’s work. I replaced the maximum memory footprint counter with a per-second rendered triangles counter (or TPS counter). CPU usage or framerate alone are not absolute measurements but the TPS counter reflects the true performance of the application the old fashioned way: the more polygons per second the better. I also updated the timer to display the processing time and the overall rendering time.

You can compare this experiment to the “old” one I made up a few weeks ago: the displayed 3D model features 30% more polygons but the CPU usage is the same.

The rendering process is done in such a way that absolutely no computation occurs on invisible objects, not even the slightest animation or z-sorting operation.

I enabled dynamic bounding volumes for animated meshes: if you place the mesh on the edges of the viewing space, you might notice that the TPS counter sometimes goes down to zero to raise up again to a few hundreds/thousands of triangles per second. The explanation is simple: the visibility test is done dynamically and follows the mesh animations. If the animation drives the mesh out of the viewing frustum, culling operates and it is not renderered.