High Dynamic Range Rendering

NV40 also features High Dynamic Range (HDR) rendering, which essentially means it supports floating point precision from input, through the whole pipeline, to output.

High Dynamic Range is used to achieve a more natural look. A monitor can never reproduce the range of brightness that our eyes operate over naturally, so HDR techniques are used to emulate within an application what we would actually see. For instance, in the image above the obelisks are not completely black, but because the light from the top is comparatively so bright we see its glow, which even partially obscures areas around the opaque object.
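The glow effect described above can be illustrated with a minimal sketch. This is not the method used by any particular title or by NV40 itself: the function names, the threshold of 1.0 for "displayable white", and the box blur standing in for a proper glow filter are all assumptions made for illustration.

```python
def bright_pass(row, threshold=1.0):
    """Keep only the energy above the displayable white level (assumed to be 1.0)."""
    return [max(v - threshold, 0.0) for v in row]


def box_blur(row, radius=1):
    """Average each value with its neighbours; edges use a shorter window."""
    out = []
    for i in range(len(row)):
        window = row[max(i - radius, 0): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def add_glow(row, threshold=1.0, radius=1):
    """Add the blurred over-bright energy back onto the original scanline."""
    glow = box_blur(bright_pass(row, threshold), radius)
    return [v + g for v, g in zip(row, glow)]


# Hypothetical scanline: dark, opaque pixels (0.0) either side of a very bright light (7.0).
scanline = [0.0, 0.0, 7.0, 0.0, 0.0]
result = add_glow(scanline)
print(result)  # [0.0, 2.0, 9.0, 2.0, 0.0]
```

The pixels next to the light pick up brightness even though they are black in the source data, which is the "partially obscured" look. The sketch only works because the scanline can hold values above 1.0; a buffer clamped to 1.0 would have nothing left over for the bright pass to spread.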

HDR is used to calculate the gradients in brightness in an attempt to emulate where areas of glow would be seen in a natural image. Various methods of calculating HDR have been used since programmable shaders were introduced, and one method that has been picked up on recently is an Extended sRGB method that renders to a 16-bit render target. However, apart from needing 11 pixel shader operations to achieve, this has the issue that no blending operation can occur on the render target, which causes problems for titles with multiple lights. NV4x features a native HDR scheme, fully compatible with the ILM-developed FP16 OpenEXR format, which is used as an input format and an output target format for films. The requirements for this are:

  • Floating Point Shading
    NV4x is already fully FP32 capable throughout the shading pipeline.
  • Floating Point Blending
    As well as standard blending operations, this is useful for multiple dynamic light calculations and accumulation effects such as motion blur and soft shadows.
  • Floating Point Texturing & Filtering
    NV4x supports filtering of FP16 textures, including Trilinear filtering and up to 16X Anisotropic.
  • High Precision Colour Buffer
    The standard Windows sRGB colour storage format is not sufficient for High Dynamic Range rendering as it has insufficient range and precision; OpenEXR, however, does have a wider range. sRGB can still be utilised for the tone mapping, colour and gamma correction phases, whilst OpenEXR could be used for storage, blending, shading, texturing and filtering during the light transport phase.
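To see why floating point blending and a high precision colour buffer matter for multiple lights, here is a small sketch under stated assumptions. The light values are hypothetical, the 8-bit path is modelled simply as clamp-and-quantise after every blend, and the Reinhard operator is used only as a familiar example of tone mapping; none of this is taken from NVIDIA's implementation. FP16 rounding uses the half-precision format in Python's standard `struct` module.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a value to the nearest IEEE 754 half-precision (FP16) number."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_unorm8(x: float) -> float:
    """Clamp to [0, 1] and quantise to 8 bits, as a conventional integer target would."""
    clamped = min(max(x, 0.0), 1.0)
    return round(clamped * 255) / 255


def accumulate(lights, store):
    """Additively blend each light, writing the running total back through `store`."""
    total = store(0.0)
    for contribution in lights:
        total = store(total + contribution)
    return total


def reinhard(x: float) -> float:
    """Simple tone-mapping operator: compresses [0, inf) into [0, 1)."""
    return x / (1.0 + x)


# Hypothetical per-light contributions to one pixel, in arbitrary units.
lights = [0.75, 0.75, 2.5]

hdr_total = accumulate(lights, to_fp16)    # keeps the full sum
ldr_total = accumulate(lights, to_unorm8)  # saturates at white after the second light

print(hdr_total, ldr_total)                # 4.0 1.0
print(to_unorm8(reinhard(hdr_total)))      # tone-mapped result ready for an 8-bit display
```

In the 8-bit path the third, brightest light contributes nothing because the buffer is already saturated. The FP16 path retains the total, and the reduction to displayable range happens once, at the end, in the tone mapping phase.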

Naturally, storing a colour buffer in FP16 format will not only require more memory but also carry a much higher bandwidth overhead than normal rendering, to the tune of twice the bandwidth per pixel. We asked David Kirk if rendering in an FP16 format would also incur a fill-rate penalty (i.e. taking two cycles per pixel); he would only respond that fill-rate usually scales with bandwidth and that this is likely to be the case here, which implies there is an associated fill-rate penalty as well. Colour compression algorithms will be different for the high precision buffer, but these will be implemented as well.
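The "twice the bandwidth per pixel" figure follows from simple arithmetic, sketched below. It assumes a four-channel (RGBA) buffer in both cases and an arbitrary 1600x1200 resolution chosen for illustration; it ignores compression, the Z buffer and any overdraw.

```python
BYTES_RGBA8 = 4 * 1    # four channels at 8 bits each
BYTES_RGBA16F = 4 * 2  # four channels at 16 bits (FP16) each


def colour_buffer_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of one uncompressed colour buffer in bytes."""
    return width * height * bytes_per_pixel


# Hypothetical resolution, chosen only for illustration.
width, height = 1600, 1200

ldr_bytes = colour_buffer_bytes(width, height, BYTES_RGBA8)
hdr_bytes = colour_buffer_bytes(width, height, BYTES_RGBA16F)

print(ldr_bytes, hdr_bytes)   # 7680000 15360000
print(hdr_bytes / ldr_bytes)  # 2.0
```

Every colour read or write moves 8 bytes instead of 4, so at a fixed memory bandwidth the number of pixels that can be written per second would halve, which is consistent with the fill-rate penalty implied above.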

Although most of the pipeline operations work under the OpenEXR format, at present the FSAA multisampling scheme does not.