Beyond3D - ATI R300 & NVIDIA NV30 - A Technical Comparison

ATI R300 & NVIDIA NV30 - A Technical Comparison - Page 10

Published on 28th Oct 2002, written by Zephyr Ni Qun for Consumer Graphics - Last updated: 14th Jun 2007

Part Three: Summary

Specification Comparison

Let's first look at how ATI and NVIDIA describe their flagship products in their whitepapers.

ATI: The RADEON 9700 is the most advanced graphics processor ever created. With 107 million transistors, it is a completely new architecture designed around the concepts of high bandwidth, parallelism, efficiency, precision, and programmability. The performance of this new architecture is staggering, more than doubling anything on the market today in every category.

NVIDIA: The "CineFX" architecture, in combination with the high-level Cg programming language, enables the paradigm shift: the convergence of real-time and cinematic quality rendering. The key factors contributing to this milestone in real-time rendering include: advanced programmability, high-precision color, high-level shading language, highly efficient architecture and high bandwidth to system memory and CPU.

OK, let's compare R300 with NV30 using their parent companies' own words.

	R300	NV30	Comments
Programmability	High	Advanced	Advanced > High
Shading language	N/A	High-level	Cg is a highlight of NV30
Bandwidth	High	High (but only to system memory and CPU)	NV30 only targets AGP 8x, but R300 targets both AGP and video memory.
Parallelism	High	N/A	ATI emphasizes its design of 4 VEs and 8 PPs.
Efficiency	High	Highly efficient architecture	NV30 emphasizes its architecture
Precision	High (96bit)	High-precision color	NV30 > R300
Scalability*	256	?	NVIDIA seems to dislike multi-chip design.

* Not the companies' words, just mine.

We find that both ATI and NVIDIA highlight programmability, bandwidth, efficiency and precision, so their claims overlap by about 80%. Because NVIDIA does not emphasize NV30's memory bandwidth, a 128-bit memory interface in NV30 looks more likely.

Of course, R300 supports DX9 HLSL and will support OpenGL2 HLSL, as does NV30.

Vendor		ATI	NVIDIA	ATI	NVIDIA
General Specification	Name	Radeon 9700	NV30	Radeon 8500	GeForce 4 Ti
	Model	Pro	-		4600
	Manufacturing Process	0.15um	0.13um	0.15um	0.15um
	Transistor Count	107M	100-120M	63M	68M
	AGP	8x	8x	4x	4x
	Core Clock Rate	325 MHz	400 ~ 450MHz	275 MHz	300 MHz
Tessellation Unit	N-Patches	✓	✓	✓	-
	Adaptive Tessellation	✓	✓	-	-
	Continuous Tessellation	✓	✓	-	-
	Displacement Mapping	✓	✓	-	-
Vertex Processing Unit	Shader Version	2	2.0+	1.1	1.1
	Number of Units	4	3	2	2
	Hardware T&L Unit	-*	?	✓	✓
	Transform Rate	325M triangles/s	272~306M triangles/s	69M triangles/s	136M triangles/s
Pixel Processing Unit	Shader Version	2	2.0+	1.4	1.3
	Number of Units	8	8	4	4
	Texture Unit(s)/Pipeline	1	2	2	2
	Textures/Pass	16	16	6	4
	Pixel Fill Rate	2.6G pixels/s	3.2~3.6G pixels/s	1.1G pixels/s	1.2G pixels/s
	Texel Fill Rate	2.6G texels/s	6.4~7.2G texels/s	2.2G texels/s	2.4G texels/s
Memory Architecture	Bus Width and Memory Type	256bit/DDR (R300 also supports DDR-II)	128bit/DDR-II	128bit/DDR	128bit/DDR
	Crossbar Controller	4 x 64bit	4 x 32bit	-	4 x 32bit
	Capacity	128MB (256MB max for R300)	128MB/256MB	128MB	128MB
	Vertex Cache	✓	✓	✓	✓
	Primitive Cache	✓	✓	?	✓
	Color/Pixel Cache	✓	✓	-	✓
	Texture Cache	✓	✓	✓	✓
	Z Cache	✓	?	-	-
	Frequency	620 MHz	800 MHz ~ 1 GHz	550 MHz	650 MHz
	Bandwidth	19.8 GB/s	12.8 ~ 16 GB/s	8.8 GB/s	10.4 GB/s
Bandwidth Optimization	Color Compression	✓ (12:1)	✓	-	-
	Fast Color-Clear	✓	?	-	-
	Z-Compression	✓ (24:1)	✓	✓	✓
	Fast Z-Clear	✓	✓	✓	✓
	Early Z-Culling	✓	✓	-	✓
	Hierarchical Z	✓ (64 pixels/cycle max, 3 levels)	?	✓	-
	Compressed Textures	✓	✓	✓	✓
Image Quality Enhancement	Programmable Multi-Sampling	✓ (Non-Grid)	✓	-	-
	Per Sample Gamma Correction	✓	?	-	-
	Program Controlled Filtering	?	✓ (TXD)	-	-

* Unlike R200 and NV25, R300 has no separate T&L unit alongside the vertex shaders for executing legacy T&L code; all T&L operations are instead carried out in hardware by the vertex shader units. Although currently unconfirmed, it is likely that NV30 will do the same.
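The fill-rate rows in the table appear to follow from simple peak-rate arithmetic: core clock times the number of pixel pipelines, then times the texture units per pipeline for texels. The sketch below reproduces those rows under that assumption; the function names are illustrative, and the NV30 clocks are the speculative 400 ~ 450 MHz range from the table, not confirmed figures.

```python
# Peak fill rates, assuming one pixel per pipeline per clock.
# Function names are illustrative; inputs are taken from the table above.

def pixel_fill_rate(core_mhz, pipelines):
    """Peak pixel fill rate in Gpixels/s."""
    return core_mhz * pipelines / 1000.0


def texel_fill_rate(core_mhz, pipelines, tmus_per_pipeline):
    """Peak texel fill rate in Gtexels/s."""
    return pixel_fill_rate(core_mhz, pipelines) * tmus_per_pipeline


# (core clock in MHz, pixel pipelines, texture units per pipeline)
CHIPS = {
    "R300": (325, 8, 1),
    "NV30 (400 MHz, speculative)": (400, 8, 2),
    "NV30 (450 MHz, speculative)": (450, 8, 2),
    "R200": (275, 4, 2),
    "NV25": (300, 4, 2),
}

for name, (mhz, pipes, tmus) in CHIPS.items():
    pixels = pixel_fill_rate(mhz, pipes)
    texels = texel_fill_rate(mhz, pipes, tmus)
    print(f"{name}: {pixels:.1f} Gpixels/s, {texels:.1f} Gtexels/s")
```

These are theoretical peaks only; real throughput depends on memory bandwidth, shader length and the other factors discussed in this article.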

Whether NV30 uses a 256-bit memory interface is not yet known. In my opinion, a 128-bit interface is more likely than a 256-bit one, given several hints such as NVIDIA's silence on memory bandwidth in its whitepaper. According to Samsung, its 1GHz DDR-II memory will enter volume production in the third quarter of 2002.
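To see what is at stake in the bus-width question, the bandwidth row of the table can be reproduced as bus width in bytes times the effective (DDR) memory frequency. The sketch below assumes that formula and decimal gigabytes; the function name is illustrative, and the 256-bit NV30 case is purely hypothetical, included only for contrast.

```python
# Peak memory bandwidth, assuming: bytes per transfer * effective frequency.
# GB here means 10^9 bytes. The function name is illustrative.

def memory_bandwidth_gb_s(bus_width_bits, effective_mhz):
    """Peak memory bandwidth in GB/s."""
    return bus_width_bits / 8 * effective_mhz / 1000.0


print(memory_bandwidth_gb_s(256, 620))   # R300, listed as 19.8 GB/s
print(memory_bandwidth_gb_s(128, 800))   # NV30 at 800 MHz DDR-II
print(memory_bandwidth_gb_s(128, 1000))  # NV30 at 1 GHz DDR-II
print(memory_bandwidth_gb_s(128, 550))   # R200
print(memory_bandwidth_gb_s(128, 650))   # NV25

# Hypothetical only: a 256-bit NV30 at 1 GHz, which nothing here confirms.
print(memory_bandwidth_gb_s(256, 1000))
```

Under these assumptions, even 1GHz DDR-II on a 128-bit bus leaves NV30 below R300's raw bandwidth, which is why the memory speed matters so much to the estimate.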

R300's raw memory bandwidth, advanced memory compression (12:1 on color, 24:1 on Z) and hierarchical Z (up to 64 pixels per cycle of raw fill) are impressive, and they are the basis of R300's excellent FSAA performance.

From this table, we can estimate the performance gaps among NV30, R300, NV25 and R200, ignoring CPU power and driver quality.

Compared with R200, NV25 has a clear advantage in games that have many complex geometric objects or are sensitive to memory bandwidth.
R300 is better than NV25. As with NV25 versus R200, R300 has a clear advantage in games that have many complex geometric objects or are sensitive to memory bandwidth. R300 will also gain a further advantage over NV25 in games that use complex shader programs. Currently, the distinct gap between NV25 and R300 in games running with 4x FSAA and/or 16x AF comes mainly from the difference in memory bandwidth, the lower-performing AF implementation in NV25, and R300's use of color buffer compression with FSAA.
We can predict that NV30 will be more powerful than R300, by a margin that depends on the 3D application. NV30 may well earn more acclaim in future 3D games with complex shading effects. (Note: it's highly likely that we will be several architectural generations down the line before games actually reach this level of complexity. Look at Doom III as an example: it is only just making use of the majority of the features introduced in NV15/R100. - Ed.) We can also see that Samsung's 1GHz DDR-II memory plays a key role in NV30's real performance.
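The geometry and bandwidth comparisons above can be put into rough numbers by taking ratios of the table's transform-rate and bandwidth rows. This is only a sketch of that arithmetic: the dictionary layout and function name are illustrative, NV30 is left out because its figures are speculative ranges, and peak-spec ratios are not measured game performance.

```python
# Ratios of peak specifications taken from the table in this article.
# Layout and names are illustrative; NV30 is omitted (speculative ranges).

SPECS = {
    "R200": {"transform_mtris": 69.0, "bandwidth_gbs": 8.8},
    "NV25": {"transform_mtris": 136.0, "bandwidth_gbs": 10.4},
    "R300": {"transform_mtris": 325.0, "bandwidth_gbs": 19.8},
}


def spec_ratio(chip_a, chip_b, key):
    """How many times larger chip_a's figure is than chip_b's."""
    return SPECS[chip_a][key] / SPECS[chip_b][key]


for a, b in [("NV25", "R200"), ("R300", "NV25")]:
    geometry = spec_ratio(a, b, "transform_mtris")
    bandwidth = spec_ratio(a, b, "bandwidth_gbs")
    print(f"{a} vs {b}: {geometry:.2f}x transform rate, "
          f"{bandwidth:.2f}x memory bandwidth")
```

On these figures, both generational steps roughly double the peak transform rate, while R300's step over NV25 brings a much larger bandwidth gain than NV25's step over R200.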
