SwiftShader 2.0: A DX9 Software Rasterizer that runs Crysis

Friday 04th April 2008, 12:00:00 AM, written by Arun

TransGaming has just released SwiftShader 2.0, a highly optimized software rasterizer that supports DX9 and Shader Model 2.0 and scales with multi-core processors. It can run (albeit slowly) many modern games, and it makes a dual-core Penryn perform similarly to the GeForce FX5600/5700 in 3DMark05.

Nicolas Capens (aka Nick on our forums) is the creator and lead programmer behind SwiftShader - he started by writing the DX7/DX8 swShader many years ago, and eventually turned it into a commercial product in 2005 with the help of TransGaming and Gavriel State in particular.

A demo is now available on TransGaming's website, with which we ran 3DMark05 and obtained a score of ~400 on a stock Core 2 Duo E8400. That's still not mind-blowingly fast, but keep in mind Direct3D's reference rasterizer would likely score in the single digits, and the SGX-based IGP in Intel's upcoming Silverthorne-based Menlow platform for UMPCs/MIDs is claimed to only score ~150. It would also be more than enough to run Vista's Aero interface smoothly.

Finally, we were told SwiftShader would run Crysis at mid-single-digit frame rates at the lowest settings on Intel quad-core systems. Definitely not very playable yet, but that should make it clear SwiftShader is perfectly usable for casual games. We look forward to seeing how SwiftShader evolves in the future and how it will perform on future high-end CPUs such as Intel's Nehalem and AMD's Shanghai - certainly it might be a fun way to benchmark CPUs once in a while.





Latest Thread Comments (189 total)
Posted by HolySmoke on Tuesday, 14-Jul-09 18:55:32 UTC
Quoting Scali
Doesn't that have to do with the virtual videomemory system in D3D10 though?
From what I understood, in DX9 all texturememory is mapped into the virtual address space at all times. But with DX10 they aren't mapped into the address space at all unless you specifically Map() them...?
That's pretty much what I meant. What I was getting at (poorly, now that I re-read the post) was that because the game features an in-engine streamer you may have to control for it.

From what I understand, this built-in streaming engine was introduced to prevent the game from crashing at the 32-bit limit in DX9 mode at the higher texture settings. While I've never experienced it myself, I know that some setups can't run the game at full detail with streaming disabled for that reason. But while it's a fine solution to a DX9 issue, it's wholly unnecessary when running in DX10 while still being enabled by default. So, ideally, you'd want to disable in-engine texture streaming manually during testing (edit: setting textures to low or medium achieves the same result).

Don't get me wrong, I've never managed to get Crysis to run faster under DX10 than DX9 and I think it's definitely an engine issue. But if you want to test the actual rendering section of the engine (especially since you're using custom shaders) then you'd want to make sure that the streaming portion is disabled, because it's a performance-affecting workaround to an altogether unrelated problem.

Posted by Davros on Tuesday, 14-Jul-09 20:23:59 UTC
wouldn't the Crysis devs know about streaming in DX10 and disable it in preference of their own streaming?
plus the 32bit limit in dx9 would surely still exist in dx10 (vista32)

Posted by Scali on Tuesday, 14-Jul-09 20:50:31 UTC
Quoting Davros
plus the 32bit limit in dx9 would surely still exist in dx10 (vista32)
Not if you only map memory when required.
As long as the CPU doesn't need to access the videomemory, there's no reason for it to take up address space on the CPU side.

All I know is that he has a point.
I have a PC with a Radeon X1900XTX, which reports WAY higher memory usage than when running Crysis on a GeForce 8800.
The Radeon goes over 1.5 GB, sometimes close to 2 GB, while the GeForce uses about 1 GB to 1.2 GB. I don't know what causes it, but the DX9 machine just uses way more memory than the DX10 one does, despite having higher detail.

Posted by Demirug on Tuesday, 14-Jul-09 21:19:03 UTC
How many times is this going to come up again?

The typical video memory window is 256 MB. It needs to be mapped into the address space of any application that uses Direct3D. Vista is somewhat smarter here, as it can dynamically change the size of the mapped window. But this can also cause crashes if the window needs to grow and there is no address space left.

When it comes to resource allocation, Direct3D 9 and 10 behave differently.

Direct3D 10 is quite simple, as every resource needs address space equal to its own size. This is even true for textures that are in video memory. The reason for this is that the virtual video memory manager must be able to swap the resource to system memory. For various reasons, resources are swapped to the process that owns them.

Direct3D 9 is more complicated. Before SP1, any managed resource needed twice the address space: once for the system memory copy and once for the real video resource. This was done for compatibility reasons. But with graphics cards that contain a large amount of video memory and only 2 GB of address space, you run into problems. Therefore there was a hotfix that became part of SP1. This hotfix tries to eliminate the address space requirement for the real resource when possible. But it still needs more address space than 10.

I don’t have the exact numbers for BattleForge here, but it requires significantly less address space with Direct3D 10 compared to 9.
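
(Editor's illustration, not part of the original post.) The accounting described above can be sketched as a toy model. The function name, the mode labels and the 700 MB texture figure are hypothetical, chosen only for illustration; the 256 MB window, the 2 GB limit and the "twice the address space" rule are taken from the post. The SP1 hotfix case is left out because the post only says it lands somewhere between the two modelled cases.

```python
"""Toy model of per-process address space used by textures (illustrative only)."""

MB = 1024 * 1024
GB = 1024 * MB

USER_ADDRESS_SPACE = 2 * GB      # the 2 GB address space mentioned in the post
VIDEO_MEMORY_WINDOW = 256 * MB   # "typical" video memory window from the post


def address_space_needed(texture_bytes, mode):
    """Return the address space consumed under a simplified model.

    mode (hypothetical labels):
      "d3d9_pre_sp1" - each managed resource counted twice
                       (system memory copy + real video resource)
      "d3d10"        - each resource counted once
    """
    if mode == "d3d9_pre_sp1":
        resources = 2 * texture_bytes
    elif mode == "d3d10":
        resources = texture_bytes
    else:
        raise ValueError(f"unknown mode: {mode}")
    # The video memory window is mapped for any Direct3D application.
    return VIDEO_MEMORY_WINDOW + resources


def headroom(texture_bytes, mode):
    """Address space left for everything else (code, heap, stacks)."""
    return USER_ADDRESS_SPACE - address_space_needed(texture_bytes, mode)


if __name__ == "__main__":
    textures = 700 * MB  # assumed workload, not a measured figure
    for mode in ("d3d9_pre_sp1", "d3d10"):
        used = address_space_needed(textures, mode) // MB
        left = headroom(textures, mode) // MB
        print(f"{mode}: {used} MB used, {left} MB left")
```

Under these assumptions, 700 MB of managed textures takes 1656 MB of address space in the pre-SP1 Direct3D 9 model against 956 MB in the Direct3D 10 model, which is the kind of gap that pushes a 32-bit process toward the 2 GB limit.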

Posted by angrylion on Tuesday, 08-Jun-10 11:54:42 UTC
Quoting Nick
That might become a good idea in the near future. Currently we have a whitepaper explaining the overall architecture, but we exchange optimization details on a per-customer/application basis.
Sorry to bring the old topic up, but am I right that this whitepaper (mentioned also on several other sites, sometimes with quotations) has never been available for free?

Posted by Nick on Monday, 14-Jun-10 16:48:25 UTC
Quoting angrylion
Sorry to bring the old topic up, but am I right that this whitepaper (mentioned also on several other sites, sometimes with quotations) has never been available for free?
I'm not sure if it was ever made available publicly, but you might be able to easily get a copy by e-mailing swiftshader@transgaming.com. Much of it has been quoted on some sites anyway and if I recall correctly it didn't contain proprietary information.

Anyway, do you have a specific question about SwiftShader? I might be able to clarify some technical aspects.

Posted by Novum on Monday, 14-Jun-10 18:06:04 UTC
Are you actually still working on it? SM 3?

Posted by Nick on Tuesday, 15-Jun-10 11:35:21 UTC
Quoting Novum
Are you actually still working on it? SM 3?
Yes, I'm still working on it (ANGLE (http://code.google.com/p/angleproject/) was a short diversion). I'm not sure if I'm at liberty to publicly talk about new features though, but again, feel free to contact swiftshader@transgaming.com for any inquiries.

Posted by angrylion on Sunday, 25-Jul-10 22:24:04 UTC
Quoting Nick
I'm not sure if it was ever made available publicly, but you might be able to easily get a copy by e-mailing swiftshader@transgaming.com.
I e-mailed at that address more than a month ago and didn't receive a reply. I'd appreciate if you could consider uploading the whitepaper. Of course, if it's indeed free and not confidential.

Posted by Nick on Tuesday, 27-Jul-10 15:26:58 UTC
Quoting angrylion
I e-mailed at that address more than a month ago and didn't receive a reply. I'd appreciate if you could consider uploading the whitepaper. Of course, if it's indeed free and not confidential.
To avoid drawing attention to this aging version of SwiftShader we won't upload the whitepaper at this time, but please send a request to swiftshader@transgaming.com again. Your previous e-mail appears not to have made it through. Sorry for the hassle.

