Looking for Bottlenecks - Engine Clock

Before we go any further, please be aware that the following exercise is in no way exhaustive. It's also not intended to be the ultimate truth on the matter; it's just our quick look at the potential avenues for improving Cypress without any architectural changes.

What we did is quite straightforward: we varied the engine and memory clocks independently (while one was varied, the other was held constant) and measured performance at each data point in a few select titles. We chose games that ship with a built-in testing utility, to keep results comparable, and all six seemed to run rather nicely on Cypress, with no apparent bugs or sub-optimal behavior. Now, let's first look at the resulting performance "curves", and then move on to analysis:


Cypress Scaling-Engine 0xAA/0xAF
(all values relative to 600MHz = 1.00)

                 650MHz  700MHz  750MHz  800MHz  850MHz
Engine Clock     1.08    1.17    1.25    1.33    1.42
Crysis:Warhead   1.08    1.15    1.19    1.27    1.35
Stalker:CoP      1.07    1.15    1.22    1.30    1.37
FarCry 2         1.08    1.14    1.18    1.22    1.26
Battleforge      1.08    1.16    1.23    1.27    1.34


Cypress Scaling-Engine 4xAA/16xAF
(all values relative to 600MHz = 1.00)

                 650MHz  700MHz  750MHz  800MHz  850MHz
Engine Clock     1.08    1.17    1.25    1.33    1.42
Crysis:Warhead   1.08    1.17    1.25    1.33    1.42
Stalker:CoP      1.08    1.17    1.23    1.27    1.36
FarCry 2         1.06    1.17    1.22    1.22    1.28
Battleforge      1.08    1.17    1.20    1.26    1.29


Cypress Scaling-Engine 8xAA/16xAF
(all values relative to 600MHz = 1.00)

                 650MHz  700MHz  750MHz  800MHz  850MHz
Engine Clock     1.08    1.17    1.25    1.33    1.42
Crysis:Warhead   1.08    1.15    1.20    1.20    1.25
FarCry 2         1.08    1.16    1.20    1.25    1.27
Battleforge      1.08    1.14    1.17    1.24    1.28


The graphs show a strong correlation between engine clock and performance, although the relationship is not 1:1: a 1% increase in engine clock translates into a less than 1% performance increase. This is not necessarily surprising, since gaming workloads are complex, intertwined affairs these days, in which different fractions of the rendering workload are limited by different bounds. Which brings us to part two of this exercise: determining the fraction of the rendering workload that's limited by the engine clock. For this, we'll use the law everyone likes to quote these days, Amdahl's, in its general form, which measures the overall speedup an improvement produces. That means any improvement, not only the addition of extra parallel processing resources! This needs noting, since everyone seems to be using poor Amdahl only in the context of parallel processing:

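For reference, the general form of Amdahl's law is commonly written as shown here. The symbols are labels picked for this sketch and may differ from the original notation: S is the overall speedup, f is the fraction of the workload affected by the improvement, and s is the speedup of that fraction (in this case, the engine clock ratio). Rearranging for f gives the expression on the right:

```latex
S = \frac{1}{(1 - f) + \dfrac{f}{s}}
\qquad\Longrightarrow\qquad
f = \frac{1 - \dfrac{1}{S}}{1 - \dfrac{1}{s}}
```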

And using that nice equation, we get the following fractions:

Cypress-Fraction Affected by Engine Clock

                 0xAA/0xAF  4xAA/16xAF  8xAA/16xAF
Crysis:Warhead   0.87       0.91        0.78
Stalker:CoP      0.92       0.74        n/a
FarCry 2         0.71       0.77        0.73
Battleforge      0.86       0.76        0.74
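
A minimal sketch of how such a fraction can be computed by solving Amdahl's law for the affected fraction. It assumes that only the 600MHz and 850MHz endpoints are used; the fractions in the table may well have been derived differently (for example from all data points), so small differences are to be expected. The function name and the choice of Python are purely illustrative.

```python
def amdahl_fraction(overall_speedup: float, component_speedup: float) -> float:
    """Solve Amdahl's law, S = 1 / ((1 - f) + f / s), for the affected fraction f."""
    if component_speedup <= 1.0:
        raise ValueError("component_speedup must be greater than 1")
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / component_speedup)


# Engine clock raised from 600MHz to 850MHz (about 1.42x).
clock_speedup = 850 / 600

# Measured speedups at 850MHz for 0xAA/0xAF, copied from the scaling table.
measured = {
    "Crysis:Warhead": 1.35,
    "Stalker:CoP": 1.37,
    "FarCry 2": 1.26,
    "Battleforge": 1.34,
}

# Endpoint-only estimate of the engine-clock-bound fraction per game.
fractions = {game: amdahl_fraction(s, clock_speedup) for game, s in measured.items()}

for game, f in fractions.items():
    print(f"{game}: {f:.2f}")
```

For these inputs, the endpoint-only estimate lands within roughly 0.01 of the 0xAA/0xAF column above.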

We'll discuss those numbers in just a bit, but before that we'll repeat the above exercise for the memory clock.