Benchmarks - Theoretical Rates

Before looking at any actual benchmark scores, we'll take a look at the theoretical rates of the boards covered in this article:

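             Core    Pixel fill-rate  Texture rate  Vertex rate    Memory clock  Bandwidth
             (MHz)   (MPixels/s)      (MTexels/s)   (MVertices/s)  (MHz)         (GB/s)
6200         300     1200             1200          225            275           8.8
6600 GT      500     2000             4000          375            500           16.0
5200 Ultra   325     1300             1300          81             325           10.4

6200 difference relative to:
6600 GT      -40.0%  -40.0%           -70.0%        -40.0%         -45.0%        -45.0%
5200 Ultra   -7.7%   -7.7%            -7.7%         176.9%         -15.4%        -15.4%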

With a 40% reduction in core clock rate from the 6600 GT, the 6200's fill-rate and Vertex throughput are down by a similar amount; however, with internal pipelines also disabled, the texture rate is reduced further. Bear in mind that this will also directly affect Pixel Shader throughput. The 6200 has the same memory bus width as the 6600 GT, at 128-bit, so the bandwidth difference depends only on the memory clock rate, which is a little over half that of the 6600 GT.
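The arithmetic behind these theoretical figures can be sketched in a few lines of Python. The per-clock unit counts, the 128-bit bus for the 5200 Ultra and the double data rate memory are assumptions inferred from the figures themselves rather than quoted specifications, and the `Board` class and `percent_diff` helper are names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Board:
    core_mhz: float        # core clock, from the table
    pixels_per_clock: int  # assumed colour writes per clock
    texels_per_clock: int  # assumed texture samples per clock
    mem_mhz: float         # memory clock, from the table
    bus_bits: int          # memory bus width (assumed for the 5200 Ultra)

    def pixel_rate(self) -> float:
        """Peak pixel fill-rate in MPixels/s."""
        return self.core_mhz * self.pixels_per_clock

    def texel_rate(self) -> float:
        """Peak texture rate in MTexels/s."""
        return self.core_mhz * self.texels_per_clock

    def bandwidth(self) -> float:
        """Peak bandwidth in GB/s, assuming two transfers per memory clock."""
        return self.mem_mhz * 2 * (self.bus_bits / 8) / 1000


def percent_diff(value: float, reference: float) -> float:
    """Percentage by which value is above (+) or below (-) reference."""
    return (value / reference - 1.0) * 100.0


BOARDS = {
    "6200": Board(300, 4, 4, 275, 128),
    "6600 GT": Board(500, 4, 8, 500, 128),
    "5200 Ultra": Board(325, 4, 4, 325, 128),
}

base = BOARDS["6200"]
for name, board in BOARDS.items():
    print(f"{name:<11} {board.pixel_rate():7.0f} "
          f"{board.texel_rate():7.0f} {board.bandwidth():6.1f}")

# Differences of the 6200 relative to the other two boards.
for name in ("6600 GT", "5200 Ultra"):
    ref = BOARDS[name]
    print(f"vs {name:<11} "
          f"{percent_diff(base.pixel_rate(), ref.pixel_rate()):+7.1f}% "
          f"{percent_diff(base.texel_rate(), ref.texel_rate()):+7.1f}% "
          f"{percent_diff(base.bandwidth(), ref.bandwidth()):+7.1f}%")
```

Under these assumptions the 6200 works out at 1200 MPixels/s, 1200 MTexels/s and 8.8 GB/s, and the texture rate deficit against the 6600 GT comes to 70% because both the clock and the number of texture samples per clock are lower.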

Looking at the top-level metric differences between the 5200 Ultra and the 6200, it would appear that the 5200 Ultra has the measure of the 6200, as for all but the Vertex Shader it appears to have higher throughputs. However, both these parts utilise fairly distinctive internal pipeline organisations, and the two are very different from each other. Although the 5200 Ultra looks to have 4 pixel pipelines, each with a single texture unit, only under limited circumstances can it be utilised in this fashion, due to the nature of its Pixel Shader structure. In this instance, although it's not shown by these specifications, the 6200 should have much larger Pixel Shader processing capabilities.

Fill-Rates

For the first test we'll take a look at some of the key fill-rate characteristics of the boards on test here:

 

 

Fill-rate (MPixels/s) by number of texture layers:

Layers       1       2       3       4       5       6       7       8
6200         709.1   386.0   211.3   148.6   113.2   92.2    77.1    66.7
6600 GT      1297.0  903.6   571.9   406.5   315.2   258.7   219.5   189.1
5200 Ultra   641.3   340.2   178.9   150.7   45.2    38.0    31.3    27.5

6200 difference relative to:
6600 GT      -45.3%  -57.3%  -63.1%  -63.4%  -64.1%  -64.4%  -64.9%  -64.7%
5200 Ultra   10.6%   13.5%   18.1%   -1.4%   150.3%  142.6%  146.3%  142.9%

Due to the organisation of the pipelines in NV43, with two fragment quads but the capability to output only 4 pixels per clock, the difference between the single and dual texturing fill-rate performances of the 6600 GT is fairly small. Under single texturing it is able to render 8 textured pixels internally, but these are bottlenecked by the output, which can only write 4 colour values to memory per clock. With two or more layers, each internal pipeline takes multiple cycles, which allows each quad of pixels to be output over multiple cycles at no overall performance loss. With the NV34 pipeline layout of the 5200 Ultra, under single, fixed function texturing it is able to put all 4 of its pipelines to use; however, as soon as more textures per pixel are required it behaves as a chip with two pixel pipelines and two texture samplers per pipe (2x2), which is why you see the performance plateau at 3-4 and 5-6 layers.

The 6200 displays neither of the characteristics of the 6600 GT and 5200 Ultra, instead showing a fairly uniform drop in performance for each texture layer applied. This indicates that one of NV43's two fragment quads has been disabled, which means that it behaves just like a part with 4 pipelines and one texture unit per pipeline (4x1).
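The three behaviours described above can be illustrated with a simplified per-clock model. This is only a sketch: the function names are invented for the illustration, the 2x2 rule for NV34 is taken from the description above, and memory bandwidth is ignored entirely, so the absolute numbers sit well above the measured ones. Only the shape of each curve is meaningful.

```python
import math


def nv43_pixels_per_clock(layers: int, quads: int = 2, outputs: int = 4) -> float:
    """Pixels completed per clock for an NV43-style part.

    Each fragment quad works on 4 pixels with one texture sample per
    pipeline per clock, and at most `outputs` colour values are written
    per clock.
    """
    internal = quads * 4 / layers
    return min(internal, outputs)


def nv34_pixels_per_clock(layers: int) -> float:
    """Pixels completed per clock for an NV34-style part.

    Single texturing uses all 4 pipelines; otherwise the chip is modelled
    as 2 pipelines with 2 texture samplers each (2x2).
    """
    if layers == 1:
        return 4.0
    return 2 / math.ceil(layers / 2)


def fill_rate(core_mhz: float, pixels_per_clock: float) -> float:
    """Theoretical fill-rate in MPixels/s, ignoring memory bandwidth."""
    return core_mhz * pixels_per_clock


for layers in range(1, 9):
    gt = fill_rate(500, nv43_pixels_per_clock(layers))             # 6600 GT
    nv44 = fill_rate(300, nv43_pixels_per_clock(layers, quads=1))  # 6200 as 4x1
    fx = fill_rate(325, nv34_pixels_per_clock(layers))             # 5200 Ultra
    print(f"{layers} layers: {gt:7.1f} {nv44:7.1f} {fx:7.1f}")
```

In this model the two-quad part loses nothing between one and two layers, the one-quad part falls in proportion to the layer count, and the 2x2 part produces identical figures for 3 and 4 layers and again for 5 and 6 layers, which is the plateau pattern noted for the 5200 Ultra.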

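Fill-rate (MPixels/s):

             Colour fill  Z fill   Single texture alpha blend  FP texture
6200         1197.9       2277.5   906.0                       606.5
6600 GT      2088.8       4109.8   2086.2                      2086.2
5200 Ultra   656.8        654.3    644.2                       656.8

6200 difference relative to:
6600 GT      -42.7%       -44.6%   -56.6%                      -70.9%
5200 Ultra   82.4%        248.1%   40.6%                       -7.7%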

Looking at the fill-rate performance of some other basic pixel operations, we see that the 6200 is behaving in a fairly straightforward manner. As with the rest of the NV4x, and NV3x, chips, the 6200 has capabilities for writing two Z and Stencil values per clock, so the Z fill-rate is approaching twice that of its standard colour fill-rate. The Single Texture Alpha blend performance is also in line with its colour-write performance once we factor in the bandwidth required for this operation, indicating that it has the hardware capabilities to blend 4 pixels per cycle, just as it can write 4 pixels per cycle. Likewise, the Floating Point texture performance is falling at about half the pixel fill-rate performance, meaning that it is taking two cycles for each sample, the same as the other NV4x based parts.
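As a quick check of these ratios, the 6200's measured figures can be compared directly. The mapping of the four values to colour fill, Z fill, alpha blend and floating point texture is an assumption inferred from the discussion above, and the variable names are made up for the illustration.

```python
# Measured 6200 fill-rates from the table above. The assignment of each value
# to a test is an assumption inferred from the text.
measured_6200 = {
    "colour": 1197.9,
    "z": 2277.5,
    "alpha_blend": 906.0,
    "fp_texture": 606.5,
}

colour = measured_6200["colour"]

# Two Z/Stencil values per clock should give a ratio approaching 2.
z_ratio = measured_6200["z"] / colour

# Two cycles per floating point sample should give a ratio of about 0.5.
fp_ratio = measured_6200["fp_texture"] / colour

# Alpha blending reads and writes the framebuffer, so it is expected to fall
# short of the colour fill-rate when bandwidth is the limit.
blend_ratio = measured_6200["alpha_blend"] / colour

print(f"Z fill / colour fill:      {z_ratio:.2f}")
print(f"FP texture / colour fill:  {fp_ratio:.2f}")
print(f"Alpha blend / colour fill: {blend_ratio:.2f}")
```

The Z ratio comes out a little under 2 and the floating point texture ratio a little over 0.5, which is consistent with two Z writes per clock and two cycles per floating point sample.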