id Software's demonstration of the work-in-progress DOOM3 at the recently concluded E3 has left many breathless when it comes to graphics. While John Carmack, lead programmer and the man responsible for the engines of past id Software classics, has given us his thoughts on the technological aspects and hardware requirements in past .plan files and assorted postings and email replies, many are still curious about DOOM3's engine and hardware compatibility.
I picked a few interesting questions raised by Beyond3D forum members, re-phrased them slightly for clarity, and sent them John's way. A few of John's answers were originally posted by me in a thread in the forums, while some are new. I asked John for short, to-the-point answers, since he has said he doesn't have much time for interviews. Here we go:
The game characters are between 2000 and 6000 polygons. Some of the heads do look a little angular in tight zooms, so we may use some custom models for cinematic scenes. Curving up the models with more polygons has a basically linear effect on performance, but making very jagged models with lots of little polygonal points would create far more silhouette edges, which could cause a disproportionate slowdown during rendering when they get close. TruForm is not an option, because the calculated shadow silhouettes would no longer be correct.
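The silhouette-edge bookkeeping Carmack refers to can be sketched in a few lines. This is a minimal illustration of the standard stencil-shadow formulation, not the DOOM engine's actual code (the type and function names are mine): an edge shared by two triangles contributes to the shadow volume only when exactly one of the two triangles faces the light, which is why a jagged model with many polygonal points generates disproportionately many silhouette edges.

```c
typedef struct { float x, y, z; } vec3;

static vec3 sub(vec3 a, vec3 b) { vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }
static float dot(vec3 a, vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static vec3 cross(vec3 a, vec3 b) {
    vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

/* A triangle faces a point light when the light lies on the front
 * side of the triangle's plane. */
static int faces_light(vec3 a, vec3 b, vec3 c, vec3 light) {
    vec3 n = cross(sub(b, a), sub(c, a));
    return dot(n, sub(light, a)) > 0.0f;
}

/* The shared edge (a, b) between triangles (a, b, c1) and (b, a, c2)
 * is a silhouette edge when exactly one of the two faces the light. */
static int is_silhouette(vec3 a, vec3 b, vec3 c1, vec3 c2, vec3 light) {
    return faces_light(a, b, c1, light) != faces_light(b, a, c2, light);
}
```

This is also why TruForm breaks the scheme: tessellating the mesh on the GPU changes the surface the hardware draws, but the CPU-computed silhouette still follows the original low-polygon hull, so the extruded shadow volume no longer matches the rendered geometry.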
Higher precision rendering. It appears that the GF3/GF4Ti clamps results (including intermediate ones) when some part of the calculation goes over 1.0. The Radeon 8500, with an internal range of up to 8.0, can keep higher values in the registers when combining, which allows for better lighting dynamics. How much of an impact will this have on DOOM3's graphics?
At the moment, it has no impact. The DOOM engine performs some pre-modulation and post-scaling to support arbitrarily bright light values without clamping, at the expense of dynamically tossing low-order precision bits, but so far, the level designers aren't taking much advantage of this. If they do (and it is a good feature!), I can allow the ATI to do this internally without losing the precision bits, as well as saving a tiny bit of speed.
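The pre-modulation and post-scaling Carmack describes can be illustrated with integer arithmetic. A minimal sketch, assuming 8-bit color channels (255 == 1.0) and an illustrative overbright factor of 4; the names and constants are mine, not the engine's. The light value is divided down before the modulate so no intermediate exceeds 1.0, then the framebuffer result is multiplied back up, at the cost of the low-order bits.

```c
#define OVERBRIGHT 4  /* illustrative: supports light values up to 4.0 */

static int clamp255(int v) { return v > 255 ? 255 : v; }

/* Naive path: the register clamps the light to 1.0 before the modulate,
 * so a 2.0 light produces the same result as a 1.0 light. */
static int combine_clamped(int surface, int light) {
    return clamp255(surface * clamp255(light) / 255);
}

/* Pre-modulated path: divide the light by OVERBRIGHT before combining
 * (tossing low-order bits), then scale the result back up in a final pass. */
static int combine_prescaled(int surface, int light) {
    int pre = light / OVERBRIGHT;        /* now fits in 0..255, never clamps */
    int combined = surface * pre / 255;  /* modulate stays in range */
    return clamp255(combined * OVERBRIGHT);
}
```

With a 0.5 surface (128) under a 2.0 light (510), the naive path clamps the light first and yields 0.5, while the prescaled path recovers roughly 1.0, minus the tossed precision bits.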
Multiple passes. You mentioned that in theory the Radeon 8500 should be faster with the number of textures you need (doing it in a single pass), but that the GF4Ti is consistently faster in practice even though it has to perform 2 or 3 passes. Could this be due to latency? While there are savings in bandwidth, there must be a cost in latency, especially when performing 7 texture reads in a single shader unit.
No, latency should not be a problem, unless they have mis-sized some internal buffers. Dividing up a fixed texture cache among six textures might well be an issue, though. It seems like the NVIDIA cards are significantly faster on very simple rendering, and our stencil shadow volumes take up quite a bit of time.
Several hardware vendors have poorly targeted their control logic and memory interfaces under the assumption that high texture counts will be used on the bulk of the pixels. While stencil shadow volumes with zero textures are an extreme case, almost every game of note does a lot of single texture passes for blended effects.
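The bandwidth trade-off raised in the question can be put into back-of-envelope numbers. This is a hypothetical sketch, assuming 4 bytes per texture fetch and per framebuffer access; the costs are illustrative round figures, not vendor data. Each extra pass adds framebuffer traffic on top of the texture reads, which is the saving a single-pass approach banks on.

```c
/* Rough per-pixel memory traffic for a lighting equation rendered in
 * one or more passes. Assumed costs: each texture fetch reads 4 bytes;
 * each pass writes a 4-byte color and reads a 4-byte depth value. */
static int bytes_per_pixel(int passes, int total_texture_fetches) {
    int texture_bytes = 4 * total_texture_fetches;
    int framebuffer_bytes = passes * (4 /* color write */ + 4 /* z read */);
    return texture_bytes + framebuffer_bytes;
}
```

Under these assumptions, 7 fetches in a single pass cost 36 bytes per pixel, while splitting the work into three passes (with a couple of textures re-fetched, say 9 fetches total) costs 60, so the single-pass card should win on bandwidth; that it loses in practice points at cache sizing and the fast-simple-rendering path instead, as the answer above suggests.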