Benchmarking Issues

Gabe’s presentation then turned to what has, over the past year, become the thorny issue of benchmarking.

The following is a list of the types of “optimisations” that Valve see as being used within benchmarks that raise concerns:

  • Camera-path-specific occlusion culling (such as the clip planes in 3DMark03)
  • Visual quality reductions (e.g. lowered filtering quality, as seen with recent UT2003 optimisations, or missing fog)
  • Rendering the scene differently (correctly) when a screen grab is detected
  • Lower-precision rendering than the application initially requested (also seen with 3DMark03)
  • Detection of shader code and replacement with alternative code
  • Scene-specific Z writes (again, as seen with 3DMark03)
  • Benchmark-specific drivers that are not publicly available
  • Fragile application- and version-specific optimisations
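To make the "shader detection and replacement" item concrete, the sketch below shows one way a driver could silently fingerprint an application's shader and substitute its own hand-tuned variant without the application's knowledge. This is purely an illustration of the mechanism being described: the function names, data structures and fingerprinting scheme are invented for this example and are not taken from any vendor's actual driver.

```python
import hashlib

# Hypothetical table mapping fingerprints of known benchmark shaders to
# hand-written replacements (possibly lower precision than the original).
REPLACEMENTS: dict[str, str] = {}

def fingerprint(shader_source: str) -> str:
    """Fingerprint a shader by hashing its source text."""
    return hashlib.sha1(shader_source.encode("utf-8")).hexdigest()

def compile_shader(shader_source: str) -> str:
    """Compile a shader, silently swapping in a substitute if it matches
    a known benchmark shader. The application never learns that the code
    it submitted was replaced."""
    return REPLACEMENTS.get(fingerprint(shader_source), shader_source)
```

Because the match is keyed to the exact shader text, such a substitution is exactly the "fragile, application- and version-specific" optimisation the list describes: any change to the shader (for example, new code delivered via Steam) produces a different fingerprint and falls back to the unoptimised path.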

Although Gabe never stated it explicitly, it seems fairly evident that many of the comments were pointed in NVIDIA’s direction, as many of these issues have been found in NVIDIA’s drivers for 3DMark03 and UT2003. When asked specifically about this, Gabe simply pointed out that he was speaking at an ATI presentation!

For a development house, especially one of such obvious importance to the industry as Valve, to speak out in this manner is fairly shocking, as developers rarely feel compelled to comment publicly on such issues. It raises the question of why Valve have decided to do this now, not least because benchmarking seemed to be of little concern to Valve with the original Half Life.

The issue, as Valve sees it, is that the raw performance difference between the two current DirectX9 architectures is so large that people need accurate data on which to base their hardware purchasing decisions. Benchmarking is in reality one of the few ways consumers can inform those decisions, and Valve’s complaint is that, with the types of optimisations mentioned above, benchmark results become so disassociated from actual in-game performance that they no longer give a true reflection of it.

Rather than being a problem only for the hardware vendors, this becomes a problem for publishers and developers such as Valve. When users see a certain level of performance in benchmarks, they expect something similar when playing the game; if they buy a title such as Half Life 2 and gameplay performance is much lower than expected, it is Valve that loses money when the game is returned because the user feels it is “poorly coded”. Half Life 2 will also receive updates via “Steam”, so new shader code could be delivered to users that has not been “optimised” in NVIDIA’s drivers, and will hence perform below what users would expect from looking at benchmarks.

Valve’s customers are likely to include not just end users but also other game developers who wish to utilise the Half Life 2 game engine for their own titles. Those developers, too, may look at the benchmarks expecting one level of performance, only to find it much lower once they start running their own code on the engine, and may come to the erroneous conclusion that this is the fault of the engine.

Some have suggested that the strength of Valve’s comments may cost them customers. Valve, however, may see it this way: if they do not speak up now and make people aware of what pure DirectX9 performance is without optimised code, they could lose more later if everyone expects the engine’s performance to be as high as what is seen in the benchmarks. This may simply have been about raising awareness, so that if there is a large difference between benchmark results and final end-user performance, it is understood not to be an issue with the Half Life 2 game or engine.