I will conclude this article with some general comments on the emails I have received:

Several people have pointed me in the direction of the Hold and Modify mode of the good ol' Amiga. This home computer was able to show 4096 different colors using only 6 bits of storage per pixel (the later AGA chipset added an 8-bit HAM mode, which is probably where the conflicting info some of you sent comes from). Normally, with 6 bits you would expect only 64 possible colors. The Amiga achieved this increase in color depth by using "differential" coding. Basically, instead of providing a complete new description of the color of a specific pixel, you describe the difference with the previous one. So instead of saying that the pixel at position (x,y) has a color RGB, you would say this pixel has the same color as the previous pixel but with a bit more green. These modifications were handled by using some identification bits for every pixel: 2 of the 6 bits were used to describe what the next 4 bits meant (a small decoding sketch follows the list):

    00 : use the next 4 bits to index the color lookup table and display that color
    01 : take the previously displayed color, but modify the red component with the next 4 bits
    10 : same as above, but modify green
    11 : same as above, but modify blue
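
To make the idea concrete, here is a minimal sketch of how such a decoder could work. It follows the bit layout listed above (2 control bits, 4 value bits); the struct, the function name and the 16-entry palette are my own illustration, not actual Amiga hardware details.

    /* Hold-And-Modify style decoding of one scan line: 6 bits per pixel,
       4-bit color components, 16-entry color lookup table. */
    #include <stdint.h>

    typedef struct { uint8_t r, g, b; } Rgb4;        /* each component 0..15 */

    void decode_ham_line(const uint8_t *pixels, int count,
                         const Rgb4 palette[16], Rgb4 *out)
    {
        Rgb4 current = palette[0];                    /* "held" color to start from */

        for (int i = 0; i < count; i++) {
            uint8_t p       = pixels[i] & 0x3F;       /* 6 bits of storage per pixel */
            uint8_t control = p >> 4;                 /* 2 bits: what do we modify?  */
            uint8_t value   = p & 0x0F;               /* 4 bits: index or new value  */

            switch (control) {
            case 0: current = palette[value]; break;  /* 00: look up a complete color  */
            case 1: current.r = value;        break;  /* 01: modify red, hold the rest */
            case 2: current.g = value;        break;  /* 10: modify green              */
            case 3: current.b = value;        break;  /* 11: modify blue               */
            }
            out[i] = current;                         /* display the held color        */
        }
    }

The same 6 bits that would normally give 64 fixed colors can now reach any of the 16 x 16 x 16 = 4096 combinations, at the price of only being able to change one component per pixel.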

Some people suggested that this might be what 3dfx does, but with a bit more complexity. If this technique were being used by 3dfx, the data inside the frame buffer would be very weird. Data like this would require special post-processing by every capture program, because without it you would end up with a completely corrupted image: some values in the buffer would be pointers to a color while other values would describe a difference. While it is not entirely impossible, I doubt that this technique is being used.

In my original article I suggested a one-dimensional filter because two-dimensional filters require an additional line cache. Many didn't really understand why I came to that conclusion, so let me explain a bit more. A one-dimensional filter only uses data along the X-axis, which is the order in which pixels are normally scanned out to the RAMDAC. A two-dimensional filter requires data from two scan lines and thus requires buffering of the previous line. Today's accelerators support resolutions up to 1600x1200, so if you need a line cache you require a buffer of 1600 pixels (width of the screen) x 2 bytes (16-bit color) = 3200 bytes, or about 3.2 KB. A 3.2 KB cache is not completely impossible, but you have to keep in mind that a texture cache is often only 8 KB large, so it doesn't seem likely that 3dfx would include an extra 3.2 KB cache just to upsample to 22-bit color. Still, it's not impossible. It all depends on how much room you have left on the chip surface. The sketch below illustrates the difference.
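
To illustrate where that extra storage comes from, here is a rough sketch of a horizontal (1-D) average next to a vertical (2-D) one. The 565 component math and the averaging filter are placeholders of my own, not a claim about the actual 3dfx filter; the only point is that the 2-D version has to keep a complete previous scan line around.

    #include <stdint.h>

    #define WIDTH 1600    /* worst-case scan line at 1600x1200 */

    /* 1-D filter: only needs pixels already fetched on the current line. */
    uint16_t filter_1d(uint16_t prev, uint16_t cur)
    {
        /* average each 565 component of two horizontally adjacent pixels */
        uint16_t r = (((prev >> 11) & 0x1F) + ((cur >> 11) & 0x1F)) / 2;
        uint16_t g = (((prev >>  5) & 0x3F) + ((cur >>  5) & 0x3F)) / 2;
        uint16_t b = (( prev        & 0x1F) + ( cur        & 0x1F)) / 2;
        return (uint16_t)((r << 11) | (g << 5) | b);
    }

    /* 2-D filter: also needs the pixel directly above, so the previous scan
       line has to be cached on-chip: WIDTH x 2 bytes = 3200 bytes. */
    static uint16_t line_cache[WIDTH];

    uint16_t filter_2d(int x, uint16_t cur)
    {
        uint16_t above = line_cache[x];          /* pixel from the previous line    */
        uint16_t mixed = filter_1d(above, cur);  /* reuse the same component average */
        line_cache[x]  = cur;                    /* current line becomes "previous"  */
        return mixed;
    }

The 1-D version only ever needs the pixel it just processed, which is why it fits so naturally into the scan-out path.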

Many of the emails I received were about this:

" ... Just give us that option to select 32 bits - if I find that the fps drops too much I can always lower it to 16 bit. On the other hand, I may not mind a drop in fps if I find that the quality of 32 bit is superior to 16 bit (or 22 bit). The point is this: let us gamers have our choice - WE DECIDE WHAT IS BEST FOR US - not 3dfx ramming 16 bit (or 22 bit) down our throats and not offering us that option. The only reason I see why 3dfx is not doing this (allowing 32bit) is simply that THEY DON'T HAVE IT - they are not capable of producing 32 bit. They have been milking the same old engine while they were king of 3d - and have remained too complacent so much so that their competitors have overtaken them and when they realized it, they came out with lots of lame excuses for this 22 bit bullshit... " (Excerpt from an email)

This opinion seems to be quite widespread on the Internet and I think it's partly true. 3dfx has tried on various occasions to downplay the importance of 24/32-bit color by claiming that today's applications don't need this color depth. Looking at the many screenshot comparisons on the 'net, I must conclude that they are right: many of those comparisons have to resort to zoomed-in shots to show the minuscule differences between Voodoo3 and TNT2. Does this mean that 16/22-bit is enough? Well, for today's games, yes. But for tomorrow's games, who knows? Games like Unreal and Quake2 were designed with Voodoo1 and 2 in mind, so it's only normal that these games look more than OK on Voodoo3. But tomorrow's games are no longer designed with only Voodoo 1/2/3 in mind. They are being designed with a horde of cards such as TNT, TNT2, ATI Rage, PVR250, S3 Savage 3D/4, and G200/400 in mind, and those games might be able to show the impact of 24/32-bit rendering. With today's games, I just don't see any huge problems, except maybe for this next item.

Many emails pointed out that I didn't mention multiple dithering in my article about 22-bit. I did write about this in the article about dithering, though. The problem of multiple dithering appears when the Voodoo1/2/3 cards have to read info back from the frame buffer and re-use it. For example, you would render a scene (render and draw in dithered form to the frame buffer) and then apply several layers of transparency over it (e.g. to simulate volumetric fog). Transparency requires a mathematical operation between the old info from the frame buffer (the dithered scene) and a new texture. Basically, you read an old dithered value from the buffer, do some math with it, and write it again, dithered, to the frame buffer. If you repeat this "read, modify, dither, write" cycle several times, you end up with an image that has been dithered not once but several times. Repeated dithering causes artifacts that cannot be recovered by simple filtering: the spatial detail and color are lost in the dither pattern. The short sketch below walks through one such cycle. For more info, read the dithering article here.
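
Here is a minimal sketch of that cycle for a single color channel, assuming a 5-bit channel (as in a 16-bit 565 buffer) and a simple 2x2 ordered-dither matrix; the blend factor and the values are made up purely for illustration. Every blend operates on an already-quantized value, so the error of each earlier pass is carried into the next one.

    #include <stdio.h>
    #include <stdint.h>

    /* 2x2 ordered-dither thresholds, spread across the step of 8 that is
       lost when going from 8 bits down to 5 bits. */
    static const int bayer2x2[2][2] = { {0, 4}, {6, 2} };

    /* Quantize an 8-bit channel value to 5 bits with matrix (ordered) dithering. */
    static uint8_t dither_to_5bit(int value, int x, int y)
    {
        int v = value + bayer2x2[y & 1][x & 1];   /* nudge the value across the step */
        if (v > 255) v = 255;
        return (uint8_t)(v >> 3);                 /* keep only the top 5 bits        */
    }

    int main(void)
    {
        int x = 0, y = 0;                            /* one example pixel            */
        uint8_t stored = dither_to_5bit(200, x, y);  /* scene rendered, written once */

        /* now apply several 50% transparent layers (channel value 40) on top */
        for (int pass = 1; pass <= 4; pass++) {
            int read_back = stored << 3;             /* read: expand 5 bits back to 8 */
            int blended   = (read_back + 40) / 2;    /* modify: blend with the layer  */
            stored = dither_to_5bit(blended, x, y);  /* dither + write                */
            printf("pass %d: stored 5-bit value = %d\n", pass, stored);
        }
        return 0;
    }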

I noticed that a couple of other articles about 16 vs 22 vs 24-bit showed sample images that used Adobe Photoshop dithering. The dithering used by this program is called "error-diffusion" dithering. This is a very advanced algorithm that requires a completely finished image before it can be applied, because the quantization error of each pixel is spread over neighboring pixels that haven't been processed yet. 3D renderers have to dither each pixel at the moment it is written (rendering happens pixel per pixel), so "error-diffusion" dithering cannot be used by traditional renderers and the images shown in those comparisons are useless. 3D cards use matrix-based (ordered) dithering, and the results of these algorithms are a couple of levels of quality lower. The sketch below shows where the difference comes from.
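
To show why error diffusion needs the finished image, here is a rough Floyd-Steinberg style sketch (the classic error-diffusion algorithm; I am not claiming Photoshop uses exactly these weights). The error of the current pixel is pushed into pixels to the right and on the next line, pixels a 3D renderer may not even have drawn yet.

    /* Floyd-Steinberg style error diffusion: quantize an 8-bit grayscale image
       to 32 levels (5 bits), stored back at 8-bit scale. It needs the complete
       image, because every pixel's quantization error is spread over neighbors
       that are only processed later. */
    void error_diffuse_to_5bit(int *img, int width, int height)
    {
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int old = img[y * width + x];
                if (old < 0)   old = 0;               /* clamp spilled-over error  */
                if (old > 255) old = 255;

                int quant = (old >> 3) << 3;          /* nearest lower 5-bit level */
                int err   = old - quant;
                img[y * width + x] = quant;

                /* push the error into pixels that have NOT been written yet */
                if (x + 1 < width)     img[y * width + x + 1]          += err * 7 / 16;
                if (y + 1 < height) {
                    if (x > 0)         img[(y + 1) * width + x - 1]    += err * 3 / 16;
                                       img[(y + 1) * width + x]        += err * 5 / 16;
                    if (x + 1 < width) img[(y + 1) * width + x + 1]    += err * 1 / 16;
                }
            }
        }
    }

A 3D card drawing triangles in arbitrary order cannot know what will end up in those neighboring pixels, which is why it has to fall back on a fixed dither matrix applied to each pixel in isolation.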

Another comment I received several times is this:

"One other interesting consequence is that the screenshot comparisons that some sites are making of the image quality from cards using 32-bit rendering against voodoos using 16/22 bit rendering are invalid, since the images from voodoos are incorrect. The only valid comparison will be from photographs!"

What these people are getting at is this: the filter is located between the frame buffer and the RAMDAC, kind of like sitting between the buffer and the screen. Now, if you take a screenshot, you read the data in the frame buffer without going through the filter; the result would be a screenshot that is 16-bit and not 22-bit, since the filtering hasn't been done. Right now I'm tempted to draw the same conclusion (at least if 3dfx uses this technique and if the capture programs don't apply this post-filtering in software, as sketched below). Once I receive a Voodoo3 board I will be able to check for differences between real onscreen quality and frame buffer captured quality.
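
Purely as a thought experiment, this is roughly what such a software post-pass in a capture program could look like, reusing the hypothetical filter_1d() from the sketch earlier in this article. Again: nobody outside 3dfx knows the real filter, so treat this as speculation, not as a description of any existing tool.

    #include <stdint.h>

    uint16_t filter_1d(uint16_t prev, uint16_t cur);  /* hypothetical filter from the sketch above */

    /* Run the guessed output filter over a raw 16-bit frame buffer grab, so the
       screenshot would match what the RAMDAC sends to the monitor. */
    void postfilter_capture(uint16_t *grab, int width, int height)
    {
        for (int y = 0; y < height; y++) {
            uint16_t prev = grab[y * width];          /* keep the unfiltered neighbor */
            for (int x = 1; x < width; x++) {
                uint16_t cur = grab[y * width + x];
                grab[y * width + x] = filter_1d(prev, cur);
                prev = cur;
            }
        }
    }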

And last but not least:

" ... Secondly, Kristof Beets' article is based entirely upon speculation. He has little idea about the kind of filtering done by the Voodoo cards, and yet still goes ahead and makes lots of silly conclusions based upon his assumptions.  This is journalism?  Please! ... "

Hmm. It seems like I didn't make it clear enough that the algorithm I described in my first article was just a guess, so let me go on the record. There is absolutely NO proof that the advanced method suggested in this article IS the method being used by 3dfx. This method is just MY guess about a possible technique that MIGHT be used by 3dfx to achieve 22-bit output from 16-bit dithered input. It's completely up to you, the reader, to decide whether or not 3dfx is using this technique.