Since we began this article, both Futuremark and ATI have published their own comments, with Futuremark in particular addressing some of NVIDIA's issues. As a portion of this article was written prior to Futuremark's response, some of our thoughts may echo what is written there.
Before we go on, here are several links to some pertinent documents:
NVIDIA's main complaint is that the benchmark does not represent actual gaming scenarios, and they make a number of assertions to back up that claim. Let's look at a few of those assertions...
Game Test 1
The first game test, "Wings of Fury", as we've noted a number of times, uses a single-texture "skybox", four texture layers on the planes, and point sprites and quad particle effects for the explosions and smoke trails.
In their critique, NVIDIA state the following regarding this particular test:
"Unfortunately, Futuremark chose a flight simulation scene for this test. This genre of games is not only a small fraction of the game market (approximately 1%), but utilizes a simplistic rendering style common to this genre. Further, the specific scene chosen is a high altitude flight simulation, which is indicative of only a small fraction of that 1%. For any given frame in the scene, up to 90% of the pixels in the frame are single textured. This occurs because the majority of the scene is the low poly-count, single textured "skybox", painted to look like sky and clouds. The documentation for 3DMark03 gives some detail about the four-layer multitexture used on the airplanes. Regrettably, all this effort is lost on the fact that the airplanes cover so few pixels on the screen to make these four layers of multitexture completely insignificant. Game Test 1 is, essentially, a single texture fill rate test. No modern games, even DX7 games, are completely dominated by this kind of simple rendering technique."
They state that 90% of the rendered pixels are single textured and that the multi-textured planes cover very little of each frame. Well, to some degree we can analyse this, thanks to the rather unique configurations of ATI's Radeon 9700 and 9500 PRO products.
Both the Radeon 9700 and 9500 PRO operate at the same clock speeds for the memory and core, and they both feature 8 pixel pipelines with one texture unit per pipe (which would seem to be the ideal configuration for GT1 if it were mainly single-textured pixels). The key difference between the two products is that the 9500 PRO has exactly half the bandwidth of the 9700, due to the latter featuring a 256-bit memory bus and the former only a 128-bit bus. Because of this relative lack of bandwidth, the 9500 PRO is actually unable to write the output of all of its pixel pipes to memory every clock cycle in 32-bit, single-texturing cases, and so its single-texturing fill-rate is less than half of its theoretical maximum. Here are the theoretical fill-rate numbers of these two boards, taken from our recent 9700 review:
Board | Pixel Fill (Mpp/s) | Texel Fill (Mtp/s) |
9700 | 1537.7 | 2169.3 |
9500 PRO | 596.1 | 2115.6 |
% Difference | -61% | -2% |
The difference in pure single-texturing performance is -61%, which means that if GT1 is single textured 90% of the time then we'd expect the performance difference between the 9500 PRO and 9700 to be in the region of about -50%. Here are the results of this test from the 3DMark03 performance article:
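To make the reasoning behind that expectation concrete, here is a minimal sketch of one way to estimate it. The model is our own simplification and is not taken from NVIDIA's or Futuremark's documents: it assumes a given share of the 9700's frame time is limited purely by single-texture fill rate and the remainder by multi-texture fill rate, then scales each part by the fill-rate ratios from the table above. The function and variable names are hypothetical.

```python
# Fill rates from the table above (Mpixels/s and Mtexels/s).
PIXEL_FILL = {"9700": 1537.7, "9500 PRO": 596.1}
TEXEL_FILL = {"9700": 2169.3, "9500 PRO": 2115.6}


def relative_speed(rates):
    """9500 PRO throughput as a fraction of 9700 throughput."""
    return rates["9500 PRO"] / rates["9700"]


def expected_drop(single_share):
    """Estimate the 9500 PRO frame-rate change relative to the 9700, in percent.

    Assumption (ours): `single_share` of the 9700's frame time is limited by
    single-texture fill rate and the rest by multi-texture fill rate.
    """
    single_time = single_share / relative_speed(PIXEL_FILL)
    multi_time = (1.0 - single_share) / relative_speed(TEXEL_FILL)
    return (1.0 / (single_time + multi_time) - 1.0) * 100.0


if __name__ == "__main__":
    fill_diff = (relative_speed(PIXEL_FILL) - 1.0) * 100.0
    print(f"single-texture fill difference: {fill_diff:.0f}%")
    print(f"expected GT1 difference at 90% single textured: {expected_drop(0.9):.0f}%")
```

With a 90% share this crude model predicts a drop a little under 60%, somewhat steeper than the rounded figure of about -50% quoted above. Either way, the expectation is a loss of half or more of the frame rate.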
Board | 640x480 | 800x600 | 1024x768 | 1280x1024 | 1600x1200 |
9700 | 191.6 | 168.9 | 133.5 | 101.3 | 80.8 |
9500 PRO | 175.7 | 145.2 | 110.7 | 81.2 | 61.6 |
% Difference | -8% | -14% | -17% | -20% | -24% |
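The percentage row can be reproduced directly from the two rows of results above it. The short sketch below does that and makes the trend explicit: the gap widens steadily as the resolution rises. The helper name is our own.

```python
RESOLUTIONS = ["640x480", "800x600", "1024x768", "1280x1024", "1600x1200"]
RESULTS_9700 = [191.6, 168.9, 133.5, 101.3, 80.8]
RESULTS_9500_PRO = [175.7, 145.2, 110.7, 81.2, 61.6]


def percent_differences(slower, faster):
    """Relative change from each `faster` value to the matching `slower` one,
    rounded to the nearest whole percent."""
    return [round((s - f) / f * 100.0) for s, f in zip(slower, faster)]


diffs = percent_differences(RESULTS_9500_PRO, RESULTS_9700)

if __name__ == "__main__":
    for resolution, diff in zip(RESOLUTIONS, diffs):
        print(f"{resolution}: {diff}%")
```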
It appears that the worst case (most bandwidth limited) performance difference is in the region of -24%, which would give the impression that the percentage of single texture pixels drawn is not 90% and that the other elements of the benchmark are accounting for a greater proportion of the render time than is being given credit for.
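As a rough illustration of that impression, the sketch below asks what share of frame time would have to be single-texture limited to produce the 1600x1200 result. It rests on the same naive assumption as the argument above, namely that frame time splits cleanly into a part that scales with single-texture fill rate and a part that scales with multi-texture fill rate. The function name is hypothetical.

```python
def implied_single_share(observed_ratio, single_ratio, multi_ratio):
    """Solve share/single_ratio + (1 - share)/multi_ratio = 1/observed_ratio.

    Each ratio is 9500 PRO performance divided by 9700 performance.
    Assumption (ours): frame time is a simple two-part mix with no overlap.
    """
    slow_single = 1.0 / single_ratio
    slow_multi = 1.0 / multi_ratio
    slow_observed = 1.0 / observed_ratio
    return (slow_observed - slow_multi) / (slow_single - slow_multi)


share = implied_single_share(
    observed_ratio=61.6 / 80.8,     # GT1 at 1600x1200
    single_ratio=596.1 / 1537.7,    # pixel fill rate
    multi_ratio=2115.6 / 2169.3,    # texel fill rate
)

if __name__ == "__main__":
    print(f"implied single-texture share: {share:.0%}")
```

Under these assumptions the implied share comes out at a little under a fifth, far from 90%. It is only as trustworthy as the naive model behind it.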
Now, although this analysis may appear to be sound on the face of it, in fact it's not, for two major reasons. First, the background in GT1 is actually generated by rendering a hemisphere with two quads for the ground and the sky; the clouds are alpha blended and do not update the Z-Buffer, while the opaque surfaces do. The net result is that the bandwidth required to produce the backgrounds in the test is less than that of the fill-rate test, which updates the Z-Buffer and blends every pixel. Second, the R300 chip of the 9500 PRO/9700 is unable to do single-cycle Trilinear filtering, meaning that it takes two cycles to produce a Trilinear filtered pixel. This offsets the bandwidth limitation, as the 9500 PRO will not be attempting to output 8 pixels per cycle.
Given these issues, the 24% performance differential would actually seem to indicate that the scene is dominated by single-textured pixels after all. The question that should be asked, though, is whether this is unrepresentative of many titles.
Even if we ignore the titles that this test is attempting to represent (a small proportion of the overall PC gaming market), some of which are likely to use similar rendering techniques, many games will still utilise single-textured effects. Take the case where volume layers need to be represented (such as layers of clouds, smoke, explosions, etc.): many are rendered by laying down multiple single-textured layers. For example, the sky in Quake 3 uses three layers of single-texture pixels, and smoke trails are rendered using a series of single-textured quads in a line. There are newer methods of rendering volume effects, especially via the use of shaders, yet these can be both expensive to use in a gaming scenario on current hardware and complex to implement.
It's also true to say that not all games use multi-texturing. For instance, the 2002 title Jedi Knight II: Jedi Outcast, which uses the multi-texturing capable Quake 3 engine, would appear not to use multi-texturing at all. If we take a look at the Anisotropic filtering performance on a fill-rate graph for Jedi Knight II, it is virtually identical to the normal Trilinear performance, which would suggest that multi-texturing is not being utilised in this title.
Refer to Futuremark's response document for their reasons for including this test.