A game such as Quake II typically has a depth complexity of about 3x, resulting in a 3x fill-rate hit. The problem with T&L is that you'll be dealing with a great deal more triangles that are a lot smaller. With that, you're going to see a great deal more overlap between triangles, and in turn more depth complexity. To illustrate this, here is a picture of NVIDIA's tree demo running in wireframe. Looking at the screenshot below, you can really see how much depth complexity there is. Of course, it should be kept in mind that this is just a technology demo. Its purpose was to show the highest quality an object could reach while still maintaining a high level of performance. Because of this, not only does the object look better than it would in any actual game, but the depth complexity is also greater. The point of showing it, though, is that while T&L will make objects look better, it will also cause a significant fill-rate and triangle-rate hit because of overdraw.





Here is a zoomed-in shot that shows more closely what I'm talking about:



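To put some rough numbers on that fill-rate hit, here is a small back-of-the-envelope calculation. The fill rate, resolution and 3x depth complexity are illustrative values only, not measurements of any particular card:

```c
#include <stdio.h>

int main(void)
{
    /* Illustrative numbers only, not measurements of any real card. */
    double raw_fill_rate    = 300e6; /* pixels per second the chip can draw */
    double depth_complexity = 3.0;   /* each screen pixel drawn ~3 times    */
    double width = 1024.0, height = 768.0;

    /* With 3x overdraw, only a third of the raw fill rate ends up as
       pixels the player actually sees on screen. */
    double effective_rate   = raw_fill_rate / depth_complexity;
    double pixels_per_frame = width * height;
    double max_fps          = effective_rate / pixels_per_frame;

    printf("Effective fill rate: %.0f Mpixels/s\n", effective_rate / 1e6);
    printf("Fill-rate-limited frame rate at %.0fx%.0f: %.1f fps\n",
           width, height, max_fps);
    return 0;
}
```

Push the depth complexity higher, as heavily tessellated T&L scenes like the tree demo do, and the achievable frame rate drops in direct proportion.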
So how can this problem of depth complexity be solved? Unfortunately there are really only two solutions: either greatly increase fill-rate or switch to a completely different architecture. Currently, only tile-based deferred rendering is capable of sidestepping the problem entirely by rendering only what you see. Basically, the scene is split into separate tiles and each tile is rendered on its own. However, instead of drawing everything within the scene (including surfaces hidden behind solid objects), only the pixels you actually see are textured and shaded, as the sketch below illustrates. So there is no depth complexity issue with this architecture, even if there is a great deal of it within the actual scene. Currently, only the PowerVR-based chips offer such a solution at the consumer level, so don't expect to see many companies using this type of architecture any time soon. The only other way is really just to increase the fill-rate, which is what most vendors seem to be doing.
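As a very rough illustration of the idea, here is a CPU-side sketch of how a tile-based deferred renderer avoids paying for overdraw. Everything here (the tile size, the structures, the flat per-triangle depth) is simplified and invented for illustration; real hardware such as PowerVR's keeps the per-tile depth and color buffers on-chip and bins triangles per tile up front:

```c
#include <stdio.h>
#include <float.h>

#define SCREEN_W 64
#define SCREEN_H 64
#define TILE     16                      /* each tile covers TILE x TILE pixels */

typedef struct { float x, y, z; } Vert;
typedef struct { Vert v[3]; int color; } Tri;

/* Signed area test used to check whether a pixel lies inside a triangle. */
static float edge(Vert a, Vert b, float px, float py)
{
    return (px - a.x) * (b.y - a.y) - (py - a.y) * (b.x - a.x);
}

static void render_tile(int tx, int ty, const Tri *tris, int ntris,
                        int framebuffer[SCREEN_H][SCREEN_W])
{
    float depth[TILE][TILE];
    int   id[TILE][TILE];
    int   shaded = 0;

    for (int y = 0; y < TILE; y++)
        for (int x = 0; x < TILE; x++) { depth[y][x] = FLT_MAX; id[y][x] = -1; }

    /* Pass 1: hidden-surface removal within the tile.  Every triangle is
       depth-tested, but nothing is textured or shaded yet, so overdraw
       costs no texturing bandwidth here. */
    for (int t = 0; t < ntris; t++)
        for (int y = 0; y < TILE; y++)
            for (int x = 0; x < TILE; x++) {
                float px = tx * TILE + x + 0.5f, py = ty * TILE + y + 0.5f;
                float e0 = edge(tris[t].v[0], tris[t].v[1], px, py);
                float e1 = edge(tris[t].v[1], tris[t].v[2], px, py);
                float e2 = edge(tris[t].v[2], tris[t].v[0], px, py);
                int inside = (e0 >= 0 && e1 >= 0 && e2 >= 0) ||
                             (e0 <= 0 && e1 <= 0 && e2 <= 0);
                if (inside && tris[t].v[0].z < depth[y][x]) {
                    depth[y][x] = tris[t].v[0].z;   /* flat depth, for brevity */
                    id[y][x]    = t;
                }
            }

    /* Pass 2: texture and shade each visible pixel exactly once,
       regardless of how many triangles overlapped it. */
    for (int y = 0; y < TILE; y++)
        for (int x = 0; x < TILE; x++)
            if (id[y][x] >= 0) {
                framebuffer[ty * TILE + y][tx * TILE + x] = tris[id[y][x]].color;
                shaded++;
            }

    printf("tile (%d,%d): %d pixels shaded once each\n", tx, ty, shaded);
}

int main(void)
{
    static int framebuffer[SCREEN_H][SCREEN_W];
    /* Two overlapping triangles: depth complexity of 2 where they overlap,
       yet no pixel is shaded more than once. */
    Tri tris[2] = {
        { { {4, 4, 0.8f}, {60, 4, 0.8f}, {4, 60, 0.8f} }, 1 },
        { { {8, 8, 0.2f}, {56, 8, 0.2f}, {8, 56, 0.2f} }, 2 },
    };

    for (int ty = 0; ty < SCREEN_H / TILE; ty++)
        for (int tx = 0; tx < SCREEN_W / TILE; tx++)
            render_tile(tx, ty, tris, 2, framebuffer);
    return 0;
}
```

The important part is the second pass: because all of the sorting happens first, fill rate and texture bandwidth are only spent on pixels that actually end up visible, no matter how high the scene's depth complexity is.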

One very large issue with T&L hardware involves vertex buffers. The problem with a vertex buffer is that it takes up a certain amount of space depending on the number of triangles used. It is EXTREMELY difficult to calculate how large a vertex buffer will be even if you know the number of triangles, because vertex data can be shared and reused between triangles. As a rule of thumb, though, you can assume that each triangle takes up about 50 bytes of memory on average. How much gets reused really varies with every game, but the point is that it all takes up space.
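To see where a figure like 50 bytes per triangle comes from, here is a rough sketch. The vertex layout, the 1.2 vertices-per-triangle sharing figure and the triangle count are all assumptions made up for the example; real games will differ:

```c
#include <stdio.h>

/* A plausible vertex layout: position, normal, one set of texture
   coordinates and a packed diffuse color.  Real layouts vary per game. */
typedef struct {
    float    x, y, z;      /* position        12 bytes */
    float    nx, ny, nz;   /* normal          12 bytes */
    float    u, v;         /* texture coords   8 bytes */
    unsigned color;        /* diffuse color    4 bytes */
} Vertex;                  /* total           36 bytes */

int main(void)
{
    /* With indexed triangles, neighbouring triangles share vertices.
       How much they share depends entirely on the mesh, which is why
       the exact buffer size is so hard to predict in advance. */
    double verts_per_tri = 1.2;   /* assumed sharing for a well-stripped mesh */
    double bytes_per_tri = verts_per_tri * sizeof(Vertex)
                         + 3.0 * sizeof(unsigned short); /* three 16-bit indices */

    unsigned triangles = 100000;  /* hypothetical scene */
    double total_mb = triangles * bytes_per_tri / (1024.0 * 1024.0);

    printf("~%.0f bytes per triangle -> %.1f MB for %u triangles\n",
           bytes_per_tri, total_mb, triangles);
    return 0;
}
```

Change the sharing ratio or add a second set of texture coordinates and the per-triangle figure moves considerably, which is exactly why the total size is so hard to pin down ahead of time.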

So just why is the vertex buffer a problem, and what can be done about it? The problem is that your graphics card's local memory is usually limited to 32 Megs, and in rare cases 64. Now assume you want to play at 1280x1024 with 32-bit rendering and a 24-bit Z-buffer. The display buffers alone take about 19,200 KB of space, or roughly 19 Megs. Now assume that you're playing Q3A. The static environment (basically everything but weapons and character models) takes up about 3-4 Megs of memory, and this isn't a triangle-heavy game; future games that push T&L to the max will use even more. The problem is that this leaves you only about 10 Megs of space for textures. As we know, Q3A can use somewhere around 52 Megs of textures, but let's just say you are using 20 Megs. What's going to happen is a lot of AGP texturing, which results in a considerable slowdown, especially if you aren't using AGP 4X. So what can be done to alleviate this problem? Some might say to store the information in system memory, but this can cause drastic slowdowns (again, especially if AGP 4X isn't used). The obvious answer is geometry compression.
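To put some numbers on the scenario above before moving on, here is a quick back-of-the-envelope calculation. The 19,200 KB figure works out if you assume three 32-bit color buffers (front, back and a third for triple buffering) plus a tightly packed 24-bit Z-buffer; that buffering assumption is ours, and drivers will allocate things somewhat differently in practice:

```c
#include <stdio.h>

int main(void)
{
    /* Assumptions: 1280x1024, 32-bit color, packed 24-bit Z, triple
       buffering, a 32 MB card and a ~4 MB static world, as above. */
    double pixels        = 1280.0 * 1024.0;
    int    color_buffers = 3;     /* front + back + triple buffer, 4 bytes each */
    double color_bytes   = 4.0;   /* 32-bit rendering */
    double z_bytes       = 3.0;   /* packed 24-bit Z  */

    double display_kb = pixels * (color_buffers * color_bytes + z_bytes) / 1024.0;

    double card_mb     = 32.0;    /* total local memory on the card */
    double geometry_mb = 4.0;     /* static world geometry          */
    double textures_mb = card_mb - display_kb / 1024.0 - geometry_mb;

    printf("Display buffers: %.0f KB (~%.1f MB)\n", display_kb, display_kb / 1024.0);
    printf("Left for textures on a %.0f MB card: ~%.1f MB\n", card_mb, textures_mb);
    return 0;
}
```

Anything beyond that roughly 9-10 Megs of textures has to come across the AGP bus, which is exactly where the slowdown described above comes from.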