A game such as Quake II typically has a depth
complexity of about 3x, resulting in a 3x fill-rate hit. With T&L,
the problem is that you'll be dealing with far more triangles
that are much smaller. As a result, you get far more overlapping
triangles, and in turn greater depth complexity.
To illustrate this, here is a picture of NVIDIA's tree demo running in
wire frame. Looking at the screenshot below you can really see how much
depth complexity there is. Of course, it should be considered that
this is just a technology demo. Its purpose was to show the highest quality that
an object could reach and still maintain a high level of performance.
Because of this, not only does this object look better than it would
in any actual game, but its depth complexity is also greater. The point
of showing it, though, is that while T&L will make objects
look better, it will also cause a significant fill-rate and triangle-rate
hit because of overdraw.
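To put numbers on the fill-rate hit, here is a quick back-of-the-envelope calculation. The resolution and frame rate are illustrative assumptions; only the 3x depth complexity figure comes from the discussion above:

```python
# Back-of-the-envelope fill-rate cost of overdraw.
# Resolution and frame rate are assumed for illustration: 1024x768 at 60 fps.

def required_fill_rate(width, height, fps, depth_complexity):
    """Pixels the card must actually draw per second, including overdraw."""
    return width * height * fps * depth_complexity

visible = required_fill_rate(1024, 768, 60, 1)  # pixels that reach the screen
drawn = required_fill_rate(1024, 768, 60, 3)    # with 3x depth complexity

print(f"visible pixels/s: {visible:,}")  # 47,185,920
print(f"drawn pixels/s:   {drawn:,}")    # 141,557,760 (~142 Mpixels/s)
```

In other words, at 3x depth complexity the card burns three times the fill-rate to produce the same on-screen image, and smaller T&L-generated triangles only push that multiplier higher.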
Here is a zoomed-in shot that shows this more closely:
So how can this problem of depth complexity be solved? Unfortunately,
there are really only two solutions: either greatly increase
fill-rate or switch to a completely different architecture. Currently,
only tile-based deferred rendering is capable of sidestepping the
problem entirely by rendering only what you see. Basically, the
scene is divided into separate tiles, and each of these tiles is rendered
individually. However, instead of rendering everything within the scene
(including surfaces hidden behind solid objects), only the surfaces you
actually see are rendered. So there are no depth complexity issues with
this architecture, even if the scene itself contains a great deal of it.
Currently, only the PowerVR-based chips offer such a solution at the
consumer level, so don't expect to see many companies using this type
of architecture any time soon. The only other way is really just to
increase the fill-rate, which is what most vendors seem to be doing.
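The tile-based deferred idea can be sketched in a few lines of toy code. This is a hypothetical simplification, not PowerVR's actual pipeline: each tile resolves visibility per pixel first, then shades only the nearest surface, so hidden surfaces cost no fill-rate at all.

```python
# Toy sketch of tile-based deferred rendering: per pixel, find the nearest
# surface *before* shading, so each pixel is shaded exactly once no matter
# how high the scene's depth complexity is.

def render_tile(tile_pixels, surfaces):
    """tile_pixels: list of (x, y); surfaces: list of (depth_fn, shade_fn)."""
    image = {}
    shaded = 0
    for (x, y) in tile_pixels:
        # Hidden-surface removal happens before any shading: pick the nearest.
        nearest = min(surfaces, key=lambda s: s[0](x, y))
        image[(x, y)] = nearest[1](x, y)
        shaded += 1                      # exactly one shade per pixel
    return image, shaded

# Three overlapping full-tile surfaces -> depth complexity of 3,
# yet each pixel in the 8x8 tile is still shaded exactly once.
tile = [(x, y) for x in range(8) for y in range(8)]
surfaces = [(lambda x, y, d=d: d, lambda x, y, c=c: c)
            for d, c in [(3.0, "red"), (1.0, "green"), (2.0, "blue")]]
image, shaded = render_tile(tile, surfaces)
print(shaded)          # 64 shades, not 192
print(image[(0, 0)])   # "green" -- the nearest surface wins
```

An immediate-mode renderer in the same situation would draw (and fill) all three surfaces at every pixel; the deferred version spends fill-rate only on what ends up visible.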
One very large issue with T&L hardware involves vertex buffers. The
problem with a vertex buffer is that it takes up a variable amount of space
depending on the number of triangles used. It is EXTREMELY difficult to
calculate how large a vertex buffer will be, even if you know the number
of triangles used. For reference, though, you can assume that on average
each triangle takes up about 50 bytes of memory. The problem with figuring
out the total memory is that vertex information can be reused. It really
varies with every game, but the point is that it takes up space.
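As a rough illustration of why the size is hard to pin down, here is a sketch using an assumed vertex layout. The byte counts are hypothetical, chosen so that a mesh with shared vertices lands near the ~50 bytes-per-triangle average mentioned above:

```python
# Rough vertex-buffer size estimate. The layout below is an assumption
# for a typical vertex of the era, not a figure from any specific API.

POSITION = 3 * 4      # x, y, z as 32-bit floats
NORMAL   = 3 * 4      # nx, ny, nz as 32-bit floats
COLOR    = 4          # packed RGBA
TEXCOORD = 2 * 4      # u, v as 32-bit floats
VERTEX_BYTES = POSITION + NORMAL + COLOR + TEXCOORD   # 36 bytes

def buffer_size(triangles, vertices_per_triangle):
    """Total bytes; vertices_per_triangle drops below 3 when vertices
    are shared between neighboring triangles."""
    return int(triangles * vertices_per_triangle * VERTEX_BYTES)

# Unshared vertices: 3 per triangle -> 108 bytes per triangle.
print(buffer_size(10_000, 3.0))   # 1,080,000 bytes
# A well-shared mesh reuses vertices; ~1.4 effective vertices per
# triangle works out to ~50 bytes per triangle, as cited above.
print(buffer_size(10_000, 1.4))   # 504,000 bytes
```

The whole uncertainty lives in that second parameter: how aggressively a given game shares vertices between triangles, which is why no single formula predicts the buffer size.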
So just why is the vertex buffer a problem and what can be done about
it? Well, the problem is that your card's frame-buffer memory is usually
limited to 32 Megs, and in rare cases, 64. Now assume you want to play at 1280x1024 with
32-bit rendering and a 24-bit Z-buffer. This is going to take 19,200 KB
of space, or about 19 Megs (that figure works out to 15 bytes per pixel,
which suggests triple-buffered 32-bit color plus a tightly packed 24-bit
Z-buffer). Now assume that you're playing Q3A. The static
environment (basically everything but weapons and character models) takes
up about 3-4 Megs of memory (and this isn't a triangle-heavy game; future
games that use T&L to the max will need even more). Now the problem
here is that you're limited to about 10 Megs of space
for textures. As we know, Q3A can use up to something around 52 Megs of
textures, but let's just say that you are using 20 Megs. What's going to
happen is there will be a lot of AGP texturing. This, of course, results
in a considerable slowdown, especially if you aren't using AGP 4X. So
what can be done to alleviate this problem? Some might just say to
store the information in system memory, but this can cause drastic
slowdowns (again, especially if AGP 4X isn't used). The obvious answer
is geometry compression.
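As a sketch of what geometry compression can buy, here is one common technique, position quantization: storing each coordinate as a 16-bit integer within a known bounding box instead of a 32-bit float. This is a generic illustration, not any particular vendor's scheme, and the bounding box and values are made up:

```python
# Position quantization: map each float coordinate in [lo, hi] onto a
# 16-bit integer (2 bytes instead of 4), halving position storage.

def quantize(positions, lo, hi):
    """Map floats in [lo, hi] to 16-bit unsigned ints."""
    scale = 65535.0 / (hi - lo)
    return [round((p - lo) * scale) for p in positions]

def dequantize(quantized, lo, hi):
    """Recover approximate floats from the 16-bit values."""
    scale = (hi - lo) / 65535.0
    return [q * scale + lo for q in quantized]

positions = [0.0, 12.5, -3.75, 100.0]     # one coordinate per entry
q = quantize(positions, lo=-128.0, hi=128.0)

raw_bytes = len(positions) * 4            # stored as 32-bit floats
compressed_bytes = len(q) * 2             # stored as 16-bit ints
print(compressed_bytes / raw_bytes)       # 0.5 -> half the vertex memory

# The error introduced is bounded by half a quantization step.
step = 256.0 / 65535.0
restored = dequantize(q, -128.0, 128.0)
assert all(abs(a - b) <= step / 2 + 1e-9 for a, b in zip(positions, restored))
```

Halving the space of each position (and doing the same for normals and texture coordinates) directly shrinks that 3-4 Meg static geometry footprint, leaving more of the card's memory free for textures.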