Analysis: Sampling and Filtering
All tests were run at 1920x1200 unless otherwise specified. The Radeon HD 3870 (RV670) was downclocked to 750/800MHz (engine/memory) to bring it to parity with the Radeon HD 4770 (RV740) and remove the extra 20GB/s of bandwidth that the 3870 has in its default configuration, which would've hampered our attempts to isolate architectural evolutions. The 8800GT, however, was left at its default settings, so it retains a slight bandwidth advantage; messing with it would've been beside the point, since its purpose is to give a bit of perspective.
That being said, we start off by looking at a pretty simple test case: point sampling from a 32x32 24-bit RGB texture. This ensures that we're sampling from cache, so we should be getting close to theoretical peak rates:
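For readers who want to check the bandwidth arithmetic behind the downclock, here's a minimal sketch. The bus widths, default memory clocks and transfers per clock are the commonly quoted board specifications, not figures stated in this article, so treat them as assumptions; the function name is ours.

```python
def bandwidth_gb_s(bus_width_bits, memory_clock_mhz, transfers_per_clock):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# ASSUMED specs (public figures, not taken from this article):
# HD 3870: 256-bit GDDR4 at 1125MHz, two transfers per clock
# HD 4770: 128-bit GDDR5 at 800MHz, four transfers per clock
# 8800GT:  256-bit GDDR3 at 900MHz, two transfers per clock
hd3870_default = bandwidth_gb_s(256, 1125, 2)
hd3870_downclocked = bandwidth_gb_s(256, 800, 2)  # the 800MHz setting used here
hd4770 = bandwidth_gb_s(128, 800, 4)
gf8800gt = bandwidth_gb_s(256, 900, 2)

print(f"HD 3870 default:     {hd3870_default:.1f} GB/s")
print(f"HD 3870 downclocked: {hd3870_downclocked:.1f} GB/s")
print(f"HD 4770:             {hd4770:.1f} GB/s")
print(f"8800GT:              {gf8800gt:.1f} GB/s")
print(f"Removed by downclock: {hd3870_default - hd3870_downclocked:.1f} GB/s")
```

Under those assumptions the downclocked 3870 and the 4770 both land at about 51.2GB/s, the downclock removes roughly 20GB/s, and the 8800GT keeps a small lead at about 57.6GB/s.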

As you can see, things were pretty brutal back in the old days for the poor RV670 when it came to texturing, as its direct competitor was something of a monster. The RV740 doesn't cover all the lost ground, but it's in a considerably better position, nearly doubling RV670's numbers. However, under normal conditions you'll probably spend more time sampling from somewhat higher resolution surfaces, and perhaps use at least bilinear filtering, so that's what's coming next. Sadly, the 8800GT didn't quite want to cooperate with our test, so it's not present in the following charts (we deemed its numbers unreliable):


We've also varied the texture format, going from full 4-component ARGB down to a single component, to see whether any odd quirks were hidden somewhere under the hood. Luckily, this wasn't the case. Be aware that the test we're using here is different from the first one (it's OpenGL-based, whereas the previous one used Direct3D). RV740 decided to be a bit naughty and not reach its theoretical limits even when sampling from cache, which was slightly annoying for us, given that we're not doing anything special at all and are using only very basic extensions like GL_ARB_multitexture. Oh well, c'est la vie! If you look at the patterns emerging, you'll notice the cache sizes being subtly suggested by the shape of the graph for each respective GPU.
Finally, a quick look at the impact of trilinear/anisotropic filtering. To mimic a scenario somewhat closer to reality, we're using a (very simple) test scene: a multi-textured quad filling half the screen, rotating around the origin:
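To make the "cache size suggested by the graph" idea a bit more concrete, here's a rough sketch of how one might read it off a set of measurements. It computes a texture's footprint from its dimensions and bytes per texel, then picks the largest footprint whose fetch rate is still within a tolerance of the peak. The helper names, the 90% tolerance and, above all, the sample points are made up for illustration; they are not our measurements and say nothing about the actual cache sizes of these GPUs.

```python
def texture_bytes(width, height, bytes_per_texel):
    """Footprint of the base mip level, ignoring padding and tiling."""
    return width * height * bytes_per_texel


def last_size_before_drop(samples, tolerance=0.9):
    """samples: iterable of (footprint_bytes, fetch_rate) pairs.

    Returns the largest footprint, scanning upwards, whose rate is still
    at least tolerance * peak; None if even the smallest one falls short.
    """
    ordered = sorted(samples)
    peak = max(rate for _, rate in ordered)
    last_good = None
    for footprint, rate in ordered:
        if rate < tolerance * peak:
            break
        last_good = footprint
    return last_good


# The 32x32 texture from the first test: 3 bytes per texel if stored
# tightly, 4 if the hardware pads RGB to 32 bits (an assumption).
small_rgb = texture_bytes(32, 32, 3)
small_rgb_padded = texture_bytes(32, 32, 4)

# SYNTHETIC, made-up points in arbitrary rate units, for illustration only.
synthetic = [
    (texture_bytes(32, 32, 4), 100.0),
    (texture_bytes(64, 64, 4), 99.0),
    (texture_bytes(128, 128, 4), 98.0),
    (texture_bytes(256, 256, 4), 60.0),
    (texture_bytes(512, 512, 4), 40.0),
]
knee = last_size_before_drop(synthetic)
print(small_rgb, small_rgb_padded, knee)
```

With the synthetic points above, the rate holds up to the 128x128 texture and falls off after it, so the sketch reports that footprint as the last one that still appears to fit.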
[Charts: filtering performance, absolute (bar chart) and relative (line chart)]
The fact that the RV740 manages to catch up with the 8800GT at 16X AF, in spite of having less than half the addressing/filtering capability and only a 25% frequency advantage, is a testament to the merits of its cache. For 16X AF you're going to be fetching quite a few samples, and having good cache management that helps to hide the latency of fetching samples from memory is definitely a boon. We'll have a more comprehensive look at filtering performance and quality in a number of different scenarios in potential upcoming performance and IQ pieces.
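To put "quite a few samples" into numbers, here's a back-of-the-envelope sketch. It assumes a textbook implementation in which N-tap anisotropic filtering takes up to N probes along the axis of anisotropy, and each probe is a full bilinear (4 texels) or trilinear (8 texels) fetch. Real hardware adapts the probe count per pixel and applies optimisations we can't see from the outside, so treat this as an upper bound and not as a description of what either GPU actually does. The function name is ours.

```python
def max_texels_per_pixel(max_anisotropy, trilinear):
    """Upper bound on texels fetched per pixel for one texture layer.

    ASSUMPTION: one bilinear (4-texel) or trilinear (8-texel) probe per
    anisotropic tap, with no adaptive reduction of the tap count.
    """
    texels_per_probe = 8 if trilinear else 4
    return max_anisotropy * texels_per_probe


bilinear_only = max_texels_per_pixel(1, trilinear=False)
trilinear_only = max_texels_per_pixel(1, trilinear=True)
af16_trilinear = max_texels_per_pixel(16, trilinear=True)

print(bilinear_only, trilinear_only, af16_trilinear)
```

Under that assumption, worst-case 16X AF with trilinear touches 128 texels per pixel per texture layer versus 4 for plain bilinear, which is why keeping those fetches in cache matters so much.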
Summing up this short sampling spree, it's fairly obvious that the RV740 is far more balanced on this front than its forebear. Whilst it's still not a sampling monster like the 8800GT (33.6 GTexels/s), it's considerably less prone to being stalled by inadequate capability in this area. Its caches seem quite robust, at least in the scenarios we analysed, and the memory controllers seem to be a tad better than the old ring-bus (some of the high-resolution texturing results suggest this).
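For reference, the 33.6 GTexels/s figure is simply texture units multiplied by core clock. The sketch below reproduces it assuming 56 texture units at a 600MHz core clock for the 8800GT; both numbers are the commonly quoted specifications and not something stated in this article. It also shows the 25% clock advantage that the 750MHz RV740 has over that assumed 600MHz.

```python
def peak_gtexels_per_s(texture_units, core_clock_mhz):
    """Theoretical peak bilinear texel rate in GTexels/s."""
    return texture_units * core_clock_mhz / 1000


# ASSUMED 8800GT specs (public figures, not from this article).
GF8800GT_TEXTURE_UNITS = 56
GF8800GT_CORE_MHZ = 600
RV740_CORE_MHZ = 750  # engine clock quoted in the test setup

gf8800gt_peak = peak_gtexels_per_s(GF8800GT_TEXTURE_UNITS, GF8800GT_CORE_MHZ)
clock_advantage = RV740_CORE_MHZ / GF8800GT_CORE_MHZ - 1

print(f"8800GT peak: {gf8800gt_peak:.1f} GTexels/s")
print(f"RV740 clock advantage: {clock_advantage:.0%}")
```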
After having sampled, it's time to mess with the sampled data in the shader core.