Image Quality
One thing that makes NVIDIA's implementation of FSAA interesting is its ability to improve 16-bit image quality. For example, consider a Quake 3 scene with a rocket trail, rendered in 16-bit color. Looking at the scene, there are obvious visual defects, many of which are related to dither patterns. In the rocket trail, the multiple renders of the trail are stacked on top of each other, so a clearly visible pattern forms and seriously detracts from the image quality. With FSAA, NVIDIA is able to get around this. Let me explain by quickly going through the process that produces the final output.
- The 16-bit scene is rendered into a super-buffer, dither patterns included.
- The 16-bit image is down-sampled to the output resolution. During this step, the colors of the additional samples in the super-buffer are averaged to form the color of each final pixel. This increases the color precision to something near 22-bit and removes the dither patterns.
- A single dither pass is made and the image is dumped into a 16-bit buffer for displaying.
We thus find that by using FSAA, we remove the unwanted dithering artifacts normally associated with 16-bit rendering. Note, however, that current NVIDIA drivers (specifically the 5.22 betas) do not support this. Once NVIDIA makes an official driver release, this feature will be enabled.
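To see where a figure like "near 22-bit" can come from, here is a minimal sketch of the averaging step. It assumes an RGB565 pixel layout and a 2x2 grid of samples per output pixel, neither of which is confirmed above, and the function names are made up for illustration. Summing four samples needs two extra bits per channel, so 5+6+5 bits become 7+8+7 = 22 bits.

```python
def unpack_rgb565(pixel):
    """Split a 16-bit RGB565 value into (r, g, b) with 5, 6 and 5 bits."""
    return (pixel >> 11) & 0x1F, (pixel >> 5) & 0x3F, pixel & 0x1F


def sum_samples(samples):
    """Sum four RGB565 samples per channel.

    The sum is the average scaled by 4, kept at full precision:
    red and blue fit in 7 bits (max 4 * 31 = 124), green in 8 bits
    (max 4 * 63 = 252), for 22 bits in total.
    """
    r = g = b = 0
    for sample in samples:
        sr, sg, sb = unpack_rgb565(sample)
        r += sr
        g += sg
        b += sb
    return r, g, b


def downsample(super_buffer, width, height):
    """Reduce a (2*width) x (2*height) super-buffer to width x height."""
    output = []
    for y in range(height):
        row = []
        for x in range(width):
            # Assumed layout: each output pixel owns one 2x2 block of samples.
            block = [super_buffer[2 * y + dy][2 * x + dx]
                     for dy in (0, 1) for dx in (0, 1)]
            row.append(sum_samples(block))
        output.append(row)
    return output


# A dither checkerboard alternating between red levels 10 and 11.
dithered = [[10 << 11, 11 << 11],
            [11 << 11, 10 << 11]]
pixel = downsample(dithered, 1, 1)[0][0]
```

Here `pixel` has a red sum of 42, which is 10.5 on the original 5-bit scale: the in-between shade the dither pattern was approximating, now stored directly instead of as a visible checkerboard. The final dither pass back down to 16-bit is not modeled in this sketch.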
Linear Frame-Buffer Access
Super-sampling has a long-standing issue with what is known as linear frame-buffer (LFB) access. Some games like to mix 2D and 3D images to produce the final image. To do this, they render the 3D scene and then, using linear frame-buffer writes, add a 2D image over the top. A good example of this can be found in some flight sims with cockpits. This is done by blitting (a bit-block transfer), which copies a block of pixel data from one memory location to another; in this case from system memory to frame-buffer memory. To understand the problem, consider a flight sim running at 800x600. If you force 4X AA, images are now rendered internally at 1600x1200 (2*800 by 2*600), but the game is still expecting an 800x600 image, because that is what it is set to. So the game selects coordinates based on an 800x600 image to place the cockpit, as it isn't aware that the image is actually 1600x1200. The result is that the cockpit covers only a quarter of the area it should and sits in the top-left corner of the screen, because that is where the coordinates of a would-be 800x600 image place it.
To get around this, NVIDIA places a lock on the super-buffer (so LFB writes can't occur) and down-samples the image to a normal-sized buffer (a buffer holding an 800x600 image). From here, linear frame-buffer writes are free to take place, because the image being written to is exactly what the game is expecting. Here is a step-by-step overview of the process:
- Image is written to a super-buffer.
- A lock is placed on the buffer.
- Image is down-sampled to normal size.
- Linear frame-buffer write occurs.
- Image is flipped and displayed.
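The coordinate mismatch and the fix can be illustrated with a toy model. This is a sketch under assumptions: buffers are plain Python lists, an 8x6 grid stands in for 800x600, the cockpit is treated as an opaque full-screen overlay, and all names are hypothetical rather than taken from any driver or API.

```python
def make_buffer(width, height, fill=0):
    """Create a height x width buffer filled with one value."""
    return [[fill] * width for _ in range(height)]


def blit(dest, src, x0=0, y0=0):
    """Copy the src block into dest with its top-left corner at (x0, y0)."""
    for y, row in enumerate(src):
        dest[y0 + y][x0:x0 + len(row)] = row


def downsample(buf):
    """Average each 2x2 block of samples into one output pixel."""
    height, width = len(buf) // 2, len(buf[0]) // 2
    return [[(buf[2 * y][2 * x] + buf[2 * y][2 * x + 1]
              + buf[2 * y + 1][2 * x] + buf[2 * y + 1][2 * x + 1]) / 4
             for x in range(width)]
            for y in range(height)]


def coverage(buf, value):
    """Fraction of pixels in buf that equal value."""
    total = len(buf) * len(buf[0])
    return sum(row.count(value) for row in buf) / total


GAME_W, GAME_H = 8, 6      # stand-in for the 800x600 the game expects
COCKPIT = 9                # marker value for overlay pixels
cockpit = make_buffer(GAME_W, GAME_H, COCKPIT)

# Naive case: the game writes its overlay straight into the super-buffer.
super_buffer = make_buffer(2 * GAME_W, 2 * GAME_H)
blit(super_buffer, cockpit)
naive_coverage = coverage(super_buffer, COCKPIT)

# Workaround: down-sample to the expected size first, then allow the write.
super_buffer = make_buffer(2 * GAME_W, 2 * GAME_H)
output = downsample(super_buffer)
blit(output, cockpit)
fixed_coverage = coverage(output, COCKPIT)
```

In the naive case the overlay fills a quarter of the buffer, anchored at the top-left corner; after down-sampling first, the same write fills the whole image. The lock itself is not modeled here, only the ordering it enforces.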
Some games like to render 3D, do an LFB write, and then do some more 3D rendering. The problem with this is that you're trying to render 1600x1200 information to an 800x600 image. To get around this, NVIDIA takes its down-sampled 800x600 image, up-samples it back to 1600x1200, and proceeds to do any additional 3D rendering. Here is what takes place:
- Image is written to a super-buffer.
- A lock is placed on the buffer.
- Image is down-sampled.
- Linear frame-buffer write occurs.
- Image is up-sampled to the original size.
- Additional rendering takes place.
- Image is down-sampled to output resolution.
- Image is flipped and displayed.
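The up-sample and second down-sample can be sketched as follows. How the hardware actually up-samples is not stated above; this example assumes simple pixel replication, and the helper names are hypothetical.

```python
def upsample(buf):
    """Replicate each pixel into a 2x2 block (an assumed, simple filter)."""
    out = []
    for row in buf:
        wide = [pixel for pixel in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))  # second copy so the rows are independent
    return out


def downsample(buf):
    """Average each 2x2 block of samples into one output pixel."""
    height, width = len(buf) // 2, len(buf[0]) // 2
    return [[(buf[2 * y][2 * x] + buf[2 * y][2 * x + 1]
              + buf[2 * y + 1][2 * x] + buf[2 * y + 1][2 * x + 1]) / 4
             for x in range(width)]
            for y in range(height)]


# A tiny 2x2 image as it looks right after the LFB write.
after_lfb = [[1, 2],
             [3, 4]]

# Back to super-buffer size so further 3D rendering has room to work.
big = upsample(after_lfb)

# Stand-in for additional 3D rendering touching a single sample.
big[0][0] = 9

# Final down-sample to output resolution before the flip.
final = downsample(big)
```

With replication, pixels that the extra rendering never touches survive the round trip unchanged, while the touched pixel becomes a blend of the new sample and the three replicated ones (here (9 + 1 + 1 + 1) / 4 = 3.0).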
The LFB round trip gets interesting because it can happen multiple times in a single frame: an LFB write, more rendering, another LFB write, and so on. This doesn't happen in games very often, but it can. Each time, the image is up-sampled for the additional 3D rendering and then down-sampled again before the next LFB write takes place. As expected, this causes a performance hit, though it will generally be pretty small. In the most extreme circumstances the hit can be considerable, but that is not likely. It really just depends on how many times the image needs to be up-sampled.
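To make the cost argument concrete, here is a small sketch that counts the resampling passes for a frame, given the order in which a game interleaves 3D rendering and LFB writes. The `"3d"` and `"lfb"` labels and the function name are invented for this example, and it assumes a pass is only performed when the buffer is at the wrong size for the next operation.

```python
def count_resample_passes(ops):
    """Return (down_samples, up_samples) for one frame.

    ops is the ordered sequence of operations the game issues,
    each either "3d" (3D rendering) or "lfb" (linear frame-buffer write).
    """
    at_super_size = True   # the frame starts out in the super-buffer
    down = up = 0
    for op in ops:
        if op == "lfb":
            if at_super_size:      # must shrink before the game may write
                down += 1
                at_super_size = False
        elif op == "3d":
            if not at_super_size:  # must grow back before rendering again
                up += 1
                at_super_size = True
        else:
            raise ValueError(f"unknown operation: {op!r}")
    if at_super_size:
        down += 1                  # final down-sample to output resolution
    return down, up


simple = count_resample_passes(["3d", "lfb"])
mixed = count_resample_passes(["3d", "lfb", "3d"])
repeated = count_resample_passes(["3d", "lfb", "3d", "lfb", "3d"])
```

A frame with a single trailing LFB write needs one down-sample and no up-sample. Each extra "LFB write, then more 3D" round adds one up-sample and one down-sample, which is why the hit grows with the number of up-samples.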