Beyond3D - ATI R300 & NVIDIA NV30 - A Technical Comparison

ATI R300 & NVIDIA NV30 - A Technical Comparison - Page 6

Published on 28th Oct 2002, written by Zephyr Ni Qun for Consumer Graphics - Last updated: 14th Jun 2007

Part Two: Pixel Shader

Pixel shaders, like vertex shaders, were introduced in DirectX 8.0. In DirectX 9.0, pixel shaders (PS) gain more over the previous version than vertex shaders (VS) do. Besides the enhancements they share with vertex shaders, such as more instruction slots and new instructions, there is a revolution in how instructions are executed. The pixel processing units in R300 and NV30 are no longer a series of simple logical switches (stages) that can only handle basic texturing and color blending operations, but true processing units that can overcome the dramatic drop in pixel fillrate when running a long pixel shading program.

ATI's R300 Natural Light Demo Demonstrates the DirectX 9 Floating Point Pixel Shader Pipeline

Pixel Processing Power

	DX8 PS1.4	DX9 PS2.0	R300	NV30	DX9 PS3.0
Pixel Shader Unit	1.4	2.0	2.0	2.0+	3.0
Float precision	-	✓	✓	✓	✓
Max texture input (per pass)	6	16	16	16	16
Max texture addressing instructions	12	32	32	1024*	1024*
Max arithmetic/color instructions	16	64	128**	1024*	1024*
Instruction Slots	28	96	160	1024	1024
Max Runtime Instruction Number	28	96	160	1024	64k
Unlimited texture fetches	-	-	-	✓	?
Static flow control	-	-	-	-	✓
Dynamic flow control	-	-	-	-	✓
Per channel masking	-	-	-	✓	✓
Multi render target	-	✓	4	-	✓

Note:
1) - No support; ✓ Supported; ? Unknown.
2) * Universal instruction slots: color and texture instructions share the same instruction space.
3) ** R300 has 32 texture instruction slots, plus 64 ALU instruction slots each for scalar and vector operations (32 + 64 + 64 = 160 in total). It can issue one instruction from each of these three sets per cycle, and so can execute up to 3 instructions per cycle. A nice design!
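
To make the difference between separate and universal instruction slots concrete, here is a minimal sketch that checks whether a shader with a given number of texture and arithmetic instructions fits within each column of the table above. The class and function names (`PixelShaderLimits`, `fits`, `targets_for`) are hypothetical and exist only for this illustration. The check looks at static instruction counts alone and ignores every other constraint (registers, texture inputs, flow control). It also treats R300's 128 arithmetic slots as a single pool, although note 3 says they are split into 64 vector and 64 scalar slots, so for R300 the result is optimistic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PixelShaderLimits:
    """Static instruction limits for one column of the comparison table."""
    name: str
    max_texture: int     # max texture addressing instructions
    max_arithmetic: int  # max arithmetic/color instructions
    total_slots: int     # instruction slots
    universal: bool      # True if texture and arithmetic share one pool


# Values copied from the table; the 'universal' flag follows note 2.
LIMITS = [
    PixelShaderLimits("DX8 PS1.4", 12, 16, 28, False),
    PixelShaderLimits("DX9 PS2.0", 32, 64, 96, False),
    # Assumption: R300's 64 vector + 64 scalar slots are lumped together.
    PixelShaderLimits("R300", 32, 128, 160, False),
    PixelShaderLimits("NV30", 1024, 1024, 1024, True),
    PixelShaderLimits("DX9 PS3.0", 1024, 1024, 1024, True),
]


def fits(limits: PixelShaderLimits, texture_ops: int, arithmetic_ops: int) -> bool:
    """Return True if the instruction counts fit within the given limits."""
    if limits.universal:
        # Universal slots: only the combined count matters.
        return texture_ops + arithmetic_ops <= limits.total_slots
    # Separate pools: each kind of instruction has its own ceiling.
    return (texture_ops <= limits.max_texture
            and arithmetic_ops <= limits.max_arithmetic)


def targets_for(texture_ops: int, arithmetic_ops: int) -> List[str]:
    """List the table columns whose static limits the shader stays within."""
    return [l.name for l in LIMITS if fits(l, texture_ops, arithmetic_ops)]


if __name__ == "__main__":
    for tex, alu in [(4, 8), (20, 60), (32, 100), (40, 200)]:
        print(f"{tex} texture + {alu} arithmetic ->", targets_for(tex, alu))
```

With these numbers, a shader of 32 texture and 100 arithmetic instructions already exceeds the PS2.0 column but fits R300, while 40 texture and 200 arithmetic instructions fit only the two columns with universal slots.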

It seems that DX9 PS2.0 follows the R300 specification closely. The strength of the NV30 PS specification impresses all of us. The jump from 160 instruction slots to 1024 is a qualitative difference, and one of significant value to DCC (digital content creation) applications. We can see the influence of NV30 in DX9 PS3.0 too.

Another interesting thing about NV30 is that its PS instructions are stored in local video memory rather than on the chip itself. This approach cuts both ways: it makes managing many fragment programs cheap, but it also adds pressure to the already limited memory bandwidth.
