Getting PerfHUD 5 up and running

NVIDIA don't want PerfHUD to be able to profile just any 3D application running on NVIDIA hardware, for obvious reasons, so they sensibly made support opt-in for the ISV: the application has to ask for PerfHUD in its own code.

To enable PerfHUD support in your Direct3D application, your code first asks Direct3D for the list of display adapters available for rendering. Assuming the instrumented display driver is installed, the query should return the standard display adapter(s) in your system, plus an extra one called NVIDIA PerfHUD. Your app then calls CreateDevice (D3D10CreateDevice for a D3D10 device) on the PerfHUD adapter ordinal, also passing in a flag to tell the D3D runtime that you want the reference rasteriser device type (D3D10_DRIVER_TYPE_REFERENCE for D3D10, D3DDEVTYPE_REF for D3D9).
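
As a rough illustration of that selection step, here's a minimal sketch in portable C++. It is not the code from NVIDIA's manual: DeviceType, DeviceChoice and chooseDevice are hypothetical names invented for this example, and matching on the substring "PerfHUD" is our assumption, based on the adapter name described above. In a real D3D9 app the description strings would come from IDirect3D9::GetAdapterIdentifier, and the result would feed the adapter and device type arguments of CreateDevice.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the D3D9 device types, so the sketch builds
// without the DirectX SDK. Hal ~ D3DDEVTYPE_HAL, Ref ~ D3DDEVTYPE_REF.
enum class DeviceType { Hal, Ref };

// Hypothetical result type: what you would hand to CreateDevice.
struct DeviceChoice {
    std::size_t adapter;  // adapter ordinal
    DeviceType type;      // device type flag
};

// descriptions[i] is the description string of adapter ordinal i.
// Assumption: the PerfHUD adapter can be spotted by the substring "PerfHUD".
DeviceChoice chooseDevice(const std::vector<std::string>& descriptions) {
    // Default to the first adapter with a normal hardware device.
    DeviceChoice choice{0, DeviceType::Hal};
    for (std::size_t i = 0; i < descriptions.size(); ++i) {
        if (descriptions[i].find("PerfHUD") != std::string::npos) {
            // PerfHUD adapter found: use its ordinal AND the reference type.
            choice.adapter = i;
            choice.type = DeviceType::Ref;
            break;
        }
    }
    return choice;
}
```

Falling back to the default adapter and a hardware device when no PerfHUD adapter turns up means the same binary still runs normally when it isn't started for profiling.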

Only the combination of calling CreateDevice with the reference type on the PerfHUD adapter ordinal gets PerfHUD enabled in your application; any other combination disables support. That requirement can cause problems for application integration, a popular example being apps that use DXUT, the DirectX Utility framework. DXUT is a widely used D3D helper framework that makes it easier to perform various mundane D3D tasks, including enumerating devices and calling CreateDevice (or the D3D10 equivalent) for you.

To give PerfHUD a good run out, we chose Andrew Lauritzen's summed area table variance shadow mapping demo (try saying that when you're drunk), written in C++ and making fairly heavy use of DXUT. The demo is shipping on the GPU Gems 3 CD, where it gets a starring chapter in the book, and Andy was kind enough to supply us with its code before Gems 3 showed up (at least through my front door, go Amazon!). DXUT makes some assumptions about device type creation which stop PerfHUD profiling out of the box, even if you force the adapter ordinal and device type using the command line.

Modifying DXUT to enable proper support of PerfHUD isn't difficult (after all, we managed!), but it's something to look out for if you use third-party code for any part of your Direct3D initialisation, since PerfHUD requires you to create the device in a non-standard way. With your app code asking for PerfHUD properly, you simply drop the executable on the PerfHUD launcher icon installed as part of the PerfHUD installation sequence, and support is activated.

If you don't use the PerfHUD launcher to start your application, even with PerfHUD support compiled in, you won't be able to enumerate the adapter. Started that way, our test app lists just the installed GeForce 8800 GTX, with no PerfHUD device.

If you don't choose the right adapter when you create the Direct3D device in code, PerfHUD will warn you with an overlay.

You'll then have to go and choose the right device and type.

The last consideration in getting PerfHUD working with your application is perhaps the most significant. Because of NVIDIA's driver architecture, you can only profile applications where the native binary shares the same bitness as your operating system, and therefore the display driver. That only matters for 64-bit Windows of course, but it means you can't profile 32-bit x86 applications on that platform. Trying to do so just gets you a can't-execute message.
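
The bitness rule is easy to express as a check. The sketch below is ours, not something from the PerfHUD manual: processBits and canProfile are hypothetical helpers, and how you obtain the OS bitness is left open (on Windows, IsWow64Process is one way to detect a 32-bit process running on a 64-bit OS).

```cpp
#include <climits>

// Bitness of the process this code is compiled into, derived from the
// pointer width (32 for an x86 build, 64 for an x64 build).
constexpr int processBits() {
    return static_cast<int>(sizeof(void*) * CHAR_BIT);
}

// The rule as described above: profiling only works when the application
// binary has the same bitness as the OS, and therefore the display driver.
constexpr bool canProfile(int appBits, int osBits) {
    return appBits == osBits;
}
```

For example, canProfile(32, 64) is false, which is the x86-app-on-64-bit-Windows case that PerfHUD refuses to launch.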

The PerfHUD 5 manual gives you code snippets for device creation for both D3D9 and D3D10, along with other device creation tips and tricks. PerfHUD integration is relatively easy and well documented, so we've focused here on the pitfalls and gotchas to highlight its limitations.