SMOOTHVISION 2.0

Radeon 8500 was supposed to feature a programmable jitter sample position FSAA scheme for enhanced image quality; however, this scheme operated on a Super-Sampling basis (PJSS), which results in quite a performance impact. Although Radeon 8500's PJSS FSAA scheme is included in R300, it has not been included in the drivers for Radeon 9700 PRO. Instead, Multi-Sampling is utilised.

R300's Multi-Sampling scheme also features the programmable jitter sampling system (PJMS), allowing for many sampling patterns beyond a straight ordered grid or a fixed jittered grid, and thus resulting in a higher level of edge anti-aliasing. 2, 4 or 6 Multi-Sample samples can be taken per pixel, each with a programmable jitter table for custom sample patterns. If no custom patterns are used, defaults will be applied.
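To illustrate why the sample pattern matters, here is a minimal Python sketch that sweeps a vertical edge across one pixel and counts how many distinct coverage levels a 4-sample pattern can produce. Both patterns (ORDERED_GRID and JITTERED) are hypothetical positions chosen for illustration; they are not ATI's actual jitter tables.

```python
# Hypothetical 4-sample patterns inside a unit pixel (x, y in 0..1).
ORDERED_GRID = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
JITTERED = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def coverage_levels(pattern, steps=100):
    """Sweep a vertical edge across the pixel and collect the distinct
    numbers of samples that fall to the left of the edge."""
    levels = set()
    for i in range(steps + 1):
        edge_x = i / steps
        levels.add(sum(1 for x, _ in pattern if x < edge_x))
    return sorted(levels)


ordered = coverage_levels(ORDERED_GRID)    # [0, 2, 4]
jittered = coverage_levels(JITTERED)       # [0, 1, 2, 3, 4]
```

Because the ordered grid shares only two distinct x positions between its four samples, a near-vertical edge can only produce three shades, whereas a pattern with four distinct x positions yields five.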

The FSAA Sampling method on R300 also features a Gamma correction technique. Some 3D applications provide Brightness settings that allow you to adjust the displayed output to suit the particular Gamma curve of your display screen. R300's AA can read in these application settings and take them into account when generating the final FSAA image, so that the averaged sample is best suited to the gamma curve of your display, thus reducing the visible aliasing on the edges.
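The effect of gamma on the averaged sample can be sketched in a few lines of Python. This assumes a simple power-law display curve with a hypothetical gamma of 2.2; how R300 actually performs the correction in hardware is not described here, so treat the functions below as an illustration of the principle only.

```python
def resolve_naive(samples):
    """Average the stored (display-encoded) sample values directly."""
    return sum(samples) / len(samples)


def resolve_gamma_correct(samples, gamma=2.2):
    """Average in linear light, then re-encode for the display.

    gamma is an assumed power-law exponent, not a documented R300 value.
    """
    linear = [s ** gamma for s in samples]        # decode to linear light
    mean = sum(linear) / len(linear)              # average actual intensity
    return mean ** (1.0 / gamma)                  # encode back for display


# An edge pixel where half the samples are white and half are black.
edge_samples = [1.0, 1.0, 0.0, 0.0]
naive = resolve_naive(edge_samples)               # 0.5
corrected = resolve_gamma_correct(edge_samples)   # roughly 0.73
```

On a gamma 2.2 display the naive value of 0.5 emits only about 22% of full intensity, so the edge pixel looks too dark; the corrected value of roughly 0.73 emits the intended 50%.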



FSAA Gamma Correction


It has previously been mentioned that ATI's Multisampling scheme has a method of applying Anti-Aliasing to Alpha textures, a case that Multi-Sampling FSAA otherwise overlooks. It transpires that R300 supports the OpenGL 1.2 alpha coverage mask call, which can be used to convert Pixel Shader output to anti-aliased coverage. If this were forced on, there could be some issues with older titles, such as Half Life, which already use this function for different purposes. At the moment ATI have not included an option to force this via the driver, but may still do so at some point. There is presently no DirectX equivalent, so this would be specific to OpenGL titles anyway.
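The idea of converting an alpha value into multisample coverage can be sketched as follows. This is a minimal Python illustration, assuming the simplest possible mapping (set roughly alpha times N of the N sample bits, lowest bits first); the function name alpha_to_coverage is hypothetical, and real hardware may choose and dither the bits differently.

```python
def alpha_to_coverage(alpha, num_samples):
    """Return a coverage bitmask with about alpha * num_samples bits set.

    Assumption: bits are filled from the lowest sample upwards, which is
    the simplest scheme and not necessarily what any real chip does.
    """
    alpha = min(max(alpha, 0.0), 1.0)             # clamp to the valid range
    covered = int(alpha * num_samples + 0.5)      # round to nearest count
    return (1 << covered) - 1


half = alpha_to_coverage(0.5, 4)    # 0b0011: two of four samples written
full = alpha_to_coverage(1.0, 6)    # 0b111111: all six samples written
```

With the mask in place, the usual multisample resolve blends the alpha texture edge in the same way it blends a polygon edge.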

SMOOTHVISION 2.0 also features an updated version of anisotropic filtering. Radeon 8500 introduced a very fast anisotropic filtering technique, though it was limited to bilinear sampling and also operated poorly when the view was rotated about the Z-Axis. R300 has two anisotropic filtering modes: Performance, which acts similarly to Radeon 8500's mode, and Quality, which can also operate in conjunction with trilinear filtering. ATI also says that the scheme, in both 'Quality' and 'Performance' modes, has been updated to take better account of texture sampling on polygons that have been rotated around the Z-Axis.

HYPER-Z III

The same basic principles of HYPER-Z II on Radeon 8500, Hierarchical Z Buffer, Fast Z Clear and Z compression, have been taken across to R300, though they have been extended...

If you remember back to our Radeon 8500 review, we explained that HYPER-Z II's Hierarchical Z Buffer consisted of two layers which, if a positive occlusion was met on all parts of the lower resolution Z buffer, could reject up to 64 pixels' worth of data and save passing that to the frame buffer, thereby minimising bandwidth usage. However, it would seem that HYPER-Z II did not feature an early Z detection routine on the display resolution Z buffer, meaning that if the pixels being tested failed the hierarchical test, all of them would be rendered regardless of whether some were actually occluded. Without early Z detection at the display resolution buffer, all the texturing and possibly lengthy pixel shading operations are carried out on each pixel regardless of whether it is occluded or not.

One of the reasons that can dissuade a manufacturer from using an early Z detection routine is that it can cause a stall in the pipeline, and the number of cycles it takes to clear the pipeline can be more than simply carrying out the pixel operations in full. However, with the number of pixel operations rising sharply as potential pixel shader program lengths grow, any stall caused by early Z rejection shrinks in comparison with the cost of running such pixel shader operations on occluded pixels. To this end, ATI have enabled early Z rejection in HYPER-Z III on R300, such that before any operations are carried out on a pixel its depth is checked first via the Hierarchical Z buffer and then, if no positive occlusion is found, against the display resolution buffer. If the pixel is deemed to be occluded by another opaque pixel, it is rejected prior to any other operations being carried out.
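The saving from early Z rejection can be shown with a small Python sketch that counts how many fragments reach the (expensive) shading stage with and without the early test. The render function and its fragment format are hypothetical, and the model ignores pipeline stalls entirely; it only shows the ordering of the depth test relative to shading.

```python
def render(fragments, width, early_z):
    """Process fragments in order and return how many were shaded.

    fragments: list of (x, depth) pairs; a smaller depth is closer.
    """
    zbuf = [1.0] * width              # depth buffer cleared to the far plane
    shaded = 0
    for x, depth in fragments:
        if early_z and depth >= zbuf[x]:
            continue                  # rejected before any shading work
        shaded += 1                   # texturing / pixel shader would run here
        if depth < zbuf[x]:           # conventional late depth test and write
            zbuf[x] = depth
    return shaded


# Three fragments land on pixel 0 front to back, one lands on pixel 1.
frags = [(0, 0.2), (0, 0.5), (0, 0.9), (1, 0.4)]
without_early_z = render(frags, width=2, early_z=False)   # 4 fragments shaded
with_early_z = render(frags, width=2, early_z=True)       # 2 fragments shaded
```

The longer the pixel shader program, the more each skipped fragment is worth, which is the trade-off described above.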

As mentioned before, Radeon 8500's Hierarchical Z Buffer only featured two levels: a single 'low resolution' buffer and the full resolution buffer. With HYPER-Z III ATI have taken this one step further and gone to three layers. This would tend to suggest that where Radeon 8500 groups pixels in 4x4 tiles in the lower resolution buffer, R300 includes a middle layer that groups pixels in 2x2 blocks. The ramification of this is that smaller occluded triangles will be detected more frequently, and hence more pixels can be rejected before queries on the display resolution buffer occur; however, it does mean that more local memory will be dedicated to the Z buffer.



3 Level Hierarchical Z Buffer


Another element of HYPER-Z III is lossless Z compression. Being lossless, it causes no image quality issues, yet it compresses the Z buffer and so reduces the bandwidth needed for Z buffer reads and writes. ATI claim a minimum compression ratio of 2:1 and a best case of 4:1 during normal rendering. However, when FSAA is enabled the compression ratios scale almost linearly with the number of FSAA samples, so, for instance, with 6X FSAA applied the best compression ratio achieved will be 24:1. Including this for FSAA operations can significantly reduce the overhead of FSAA processing.



Z Compression


Alongside Z buffer compression, R300 also features colour compression when FSAA is enabled. Because of the way multisampling operates, many of the subsamples contain the same colour data; it is only at polygon edges and intersections that there is ever more than one colour value across all the subsamples of a pixel. R300 is able to compress the colour samples and achieve a very high compression ratio with multisampling.
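Why multisampled colour data compresses so well can be seen with a toy Python model. It assumes the simplest conceivable scheme (store one colour when every subsample in a pixel matches, otherwise store them all) and ignores any per-pixel flag overhead; R300's actual compression format is not described here, so the function names and the scheme itself are illustrative assumptions.

```python
def compress_pixel(samples):
    """Store a single colour if all subsamples agree, else keep every one."""
    if all(s == samples[0] for s in samples):
        return ("uniform", samples[0])
    return ("raw", tuple(samples))


def compression_ratio(pixels):
    """Ratio of subsample values held uncompressed to values actually stored."""
    total = sum(len(p) for p in pixels)
    stored = sum(
        1 if compress_pixel(p)[0] == "uniform" else len(p) for p in pixels
    )
    return total / stored


# Nine interior pixels and one edge pixel, all with 6 subsamples.
interior = [(200, 50, 50)] * 6
edge = [(200, 50, 50)] * 3 + [(0, 0, 0)] * 3
ratio = compression_ratio([interior] * 9 + [edge])    # 60 / 15 = 4.0
```

Since edge pixels are usually a small fraction of the frame, the ratio tends towards the sample count as the proportion of interior pixels rises.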

Finally, HYPER-Z III also features Fast Z Clear. Between each rendered frame the Z buffer needs to be reset, which means its values have to be cleared. If the display resolution is set to 1600x1200, roughly 7.7MB of data must be cleared, and doing this value by value costs both time and bandwidth. HYPER-Z III clears the Z buffer in blocks, which requires only 1/64 of the amount of data to be passed.
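The figures quoted above can be checked with a short calculation. The sketch below assumes a 32-bit (4 byte) depth value per pixel, which is what makes 1600x1200 come out at roughly 7.7MB; the function name is hypothetical and the 1/64 factor is simply the one stated in the text.

```python
def z_clear_bytes(width, height, bytes_per_pixel=4, block_factor=1):
    """Bytes written to clear the Z buffer.

    bytes_per_pixel=4 assumes a 32-bit depth format; block_factor=64
    models the stated 1/64 reduction from clearing in blocks.
    """
    return width * height * bytes_per_pixel // block_factor


per_pixel_clear = z_clear_bytes(1600, 1200)                 # 7,680,000 bytes
fast_clear = z_clear_bytes(1600, 1200, block_factor=64)     # 120,000 bytes
```

So a block-based clear moves about 120KB per frame instead of about 7.7MB.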



Fast Z Clear