3Dlabs 'Slipstream Technology'

When the P10 architecture was first brought to our attention it caused a little confusion, as its raster technique included what 3Dlabs referred to as a 'tile'. It transpired that the 'tile' was just a pixel cache that allowed the chip to batch pixels up and 'burst' them to the external frame buffer. This was confusing because PowerVR, the 3D technology licensing division of Imagination Technologies, utilise an alternative architecture to traditional 3D chips called 'Tile Based Deferred Rendering'.

The key principle of PowerVR's architecture is that, rather than rendering the scene directly as the geometry data is sent to the 3D chip (known as Immediate Mode Rendering, or IMR), as most 3D renderers do, the chip defers the rendering of the scene. The objective of deferring the rendering is to store (or 'bin') the geometry locally on the card so that all the information for the scene is known before it's rasterised. Once all the geometry is in the 'bin' it is then regionalised, or broken into tiles. These tiles match the size of an on-chip, high-bandwidth pixel cache (or 'tile'), which can be rendered to extremely quickly. Another advantage of deferring the rendering is that all the Z values of the geometry will be known before the raster pipeline renders the pixels, which means that prior to calculating the pixels all the geometry can be sorted into surfaces which can be seen and surfaces which can't. The surfaces that can't be seen can be discarded entirely before any pixel calculations are done on them (if you wish to learn more about PowerVR's tiling architecture read our articles: PowerVR Tile Based Rendering, PowerVR Neon250 Review, Videologic Vivid! KYRO Review).

The main advantages of this architecture are that no pixel processing is wasted on opaque overdrawn pixels and that the on-chip tile, to all intents and purposes, acts as a frame buffer, so a write to the external frame buffer need only be made once all the processing for each tile is done, thus reducing the overall bandwidth requirements. The main disadvantage of this architecture is that the geometry for each frame needs to be stored, thus using up more memory.
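To make the bin-then-render idea a little more concrete, here is a minimal software sketch in Python. It is purely illustrative and is not how PowerVR's (or 3Dlabs') hardware is implemented: primitives are simplified to screen-aligned rectangles with a single depth value, everything is opaque, and the names `TILE`, `bin_primitives`, `render_tile` and `render_frame` are hypothetical.

```python
from collections import defaultdict

TILE = 32  # hypothetical tile edge in pixels; real hardware sizes differ


def bin_primitives(prims, tile=TILE):
    """Bin each primitive into every tile its bounding box overlaps."""
    bins = defaultdict(list)
    for p in prims:
        x0, y0, x1, y1 = p["bbox"]  # inclusive pixel bounds
        for ty in range(y0 // tile, y1 // tile + 1):
            for tx in range(x0 // tile, x1 // tile + 1):
                bins[(tx, ty)].append(p)
    return bins


def render_tile(tx, ty, prims, tile=TILE):
    """Resolve visibility for the whole tile first, then shade once per pixel."""
    left, top = tx * tile, ty * tile
    right, bottom = left + tile - 1, top + tile - 1
    nearest = {}  # (x, y) -> (z, primitive); stands in for the on-chip tile
    for p in prims:
        x0, y0, x1, y1 = p["bbox"]
        for y in range(max(y0, top), min(y1, bottom) + 1):
            for x in range(max(x0, left), min(x1, right) + 1):
                # Smaller z is closer to the viewer in this sketch.
                if (x, y) not in nearest or p["z"] < nearest[(x, y)][0]:
                    nearest[(x, y)] = (p["z"], p)
    # Only the surviving (visible) surface is "shaded" for each pixel.
    return {xy: entry[1]["color"] for xy, entry in nearest.items()}


def render_frame(prims, tile=TILE):
    """Render all bins; return the frame buffer and the number of shaded pixels."""
    framebuffer, shaded = {}, 0
    for (tx, ty), tile_prims in bin_primitives(prims, tile).items():
        pixels = render_tile(tx, ty, tile_prims, tile)
        shaded += len(pixels)
        framebuffer.update(pixels)  # one write-out per finished tile
    return framebuffer, shaded


# A far 64x64 quad fully covering the screen, with a nearer 32x32 quad on top.
scene = [
    {"bbox": (0, 0, 63, 63), "z": 0.9, "color": "red"},
    {"bbox": (16, 16, 47, 47), "z": 0.1, "color": "blue"},
]
framebuffer, shaded = render_frame(scene)
```

In this toy scene each of the 4,096 screen pixels is shaded exactly once, whereas a simple immediate mode renderer that received the far quad first would shade the 1,024 overlapped pixels twice. Note also that `bin_primitives` has to hold every primitive for the frame before any tile is rendered, which is the memory cost described above.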

It's the aforementioned drawback to tile based deferred rendering that has probably hindered it from being more widespread than it is today. There have long been discussions about how much geometry data will be required for future games and what issues and drawbacks there will be for such a system if there is insufficient local storage space to 'bin' an entire frame of geometry. Although these issues have largely proven to be non-issues thus far, the discussions continue, and that's only where games are concerned, which are traditionally low geometry affairs in comparison to workstation applications. Many people may find the idea that boards targeted at workstations could utilise a deferred rendering method quite frankly absurd. I will admit that I had discounted this possibility; however, it appears that this is exactly what 3Dlabs are trying to achieve with Wildcat VP560.




The exact details of what 3Dlabs are doing are a little thin on the ground at the moment, so I'll copy the information 3Dlabs have given out so far verbatim:

  • Patented deferred rendering technology to optimize silicon efficiency
    • Within the framework of a traditional graphics pipeline
    • Completely transparent to the driver and application software
  • Bin all primitives within a region to exploit spatial coherence
    • Significantly increases cache performance and reduces memory load
  • Use bin data to eliminate obscured primitives
    • Never use processor cycles to process pixels that will not be visible
  • Robust implementation – transparent to software drivers
    • Completely handles context and state
    • Gracefully handles overflow conditions
    • Efficiently handles small primitives
  • Real-world benefits:
    • 50% or more boost to real applications

Ostensibly this sounds very similar to what PowerVR have been achieving with their architecture for some time; however, many questions need to be answered: How close is this to PowerVR's design? What method does it use to calculate obscured primitives? Is this enabled full time or can it switch between immediate mode rendering and deferred rendering? How does it 'gracefully' handle the overflow conditions? Is this implemented in hardware and, if so, why didn't P10 feature it?

Needless to say, we'll endeavour to find out more details of how 'Slipstream' is implemented in P9.

Probably P10's main Achilles heel in comparison to the current consumer boards out there was the lack of any sophisticated bandwidth saving schemes; however, it's possible that 'Slipstream' may go some way to redressing that in P9. 3Dlabs state that 'Slipstream' will now be a feature in all their upcoming VPUs.

What of Creative?

We've recently heard that Creative, 3Dlabs' parent company, will not be adopting the P10 for consumer oriented 3D boards and instead they will still utilise NVIDIA based boards in the shape of NV30 in the European market, and ATI's Radeon 9700 Pro in Asia. While this takes care of the high end, at least until 3Dlabs can bring out chips more focused on the high end consumer market, we've not heard of Creative's plans for the low end.

When talking about future 3D products Creative were keen to point out the target of being able to see 3D chips at the 'SoundBlaster price point', in other words around the $99 mark. Given that P9 is a cut down version of P10 and that the manufacturing cost of the .15µ process has been going down, does P9 represent a good opportunity for Creative to reach this price point? Whether Creative does bring P9 to the market is likely to hinge on cost, driver compatibility and the effectiveness of 'Slipstream'.