Conclusion

Having looked at R520, it's clear which key points ATI have concentrated on: Shader Model 3.0, architectural efficiency and image quality.

When looking at ATI's Shader Model 3.0 implementation, especially in light of their marketing line of "Shader Model 3.0 Done Right", there is inevitably going to be some contention over the Vertex Shader 3.0 implementation and its lack of Vertex Texturing. ATI's position is that it wasn't useful in this generation of hardware: they could either support it in a minimal fashion, at a level of performance that would make it of little practical use, or spend a significant chunk of the transistor budget, inevitably taken from somewhere else, to support it more effectively. Taking a look at ATI's Xbox 360 "Xenos" graphics processor we can certainly see that unified architectures are going to be far more conducive to Vertex Texturing - a unified architecture is built to inherently cope with texture latencies, which solves many of the Vertex Texturing performance problems, and all the texture capabilities available to pixel processing also become available to the vertex shader, inclusive of wide and flexible texture format support, mip-mapping and filtering; current architectures just aren't there yet. The debate over whether SM3.0 requires Vertex Texturing will inevitably rumble on, but assuming the X1000 series passes WHQL it would appear that Microsoft will abide by ATI's decision.

It is clear where ATI have put their Shader Model 3.0 efforts, and what they evidently feel is the main distinguishing feature of this graphics programming model: Dynamic Branching in the Pixel Shader. With batch sizes of 16 pixels the performance penalties of branching within the Pixel Shader are reduced considerably, which makes branching in the Pixel Shader a far more usable capability - how much this will spur developers on to utilise it remains to be seen.
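To illustrate why batch size matters for branching, here is a minimal sketch of a toy cost model, not a description of how R520 actually schedules work. It assumes that pixels are shaded in square tiles, that every pixel in a tile pays for each branch path taken by any pixel in that tile, and that the cheap and expensive paths cost 1 and 10 units; the function names, the tile shapes and the costs are all hypothetical.

```python
def ideal_cost(mask, cheap=1, expensive=10):
    """Cost if every pixel only paid for the path it actually takes."""
    return sum(expensive if p else cheap for row in mask for p in row)


def batch_cost(mask, tile, cheap=1, expensive=10):
    """Cost when pixels are shaded in tile x tile batches.

    Toy assumption: if the pixels in a batch disagree on the branch,
    the whole batch pays for both paths.
    """
    height, width = len(mask), len(mask[0])
    total = 0
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            pixels = [mask[y][x]
                      for y in range(y0, min(y0 + tile, height))
                      for x in range(x0, min(x0 + tile, width))]
            per_pixel = 0
            if any(pixels):       # at least one pixel takes the expensive path
                per_pixel += expensive
            if not all(pixels):   # at least one pixel takes the cheap path
                per_pixel += cheap
            total += per_pixel * len(pixels)
    return total


def column_mask(size=16):
    """Hypothetical scene: only the left-most pixel column takes the expensive path."""
    return [[x == 0 for x in range(size)] for _ in range(size)]


if __name__ == "__main__":
    mask = column_mask()
    print("ideal:", ideal_cost(mask))             # 400
    print("16-pixel batches:", batch_cost(mask, 4))    # 896
    print("256-pixel batches:", batch_cost(mask, 16))  # 2816
```

In this toy scene the 16-pixel batches (4x4 tiles here) cost a little over twice the ideal, while a single 256-pixel batch costs seven times as much, because one divergent column forces every pixel in the batch through both paths.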

Some might argue that the transistor cost of supporting such fine granularity in the batches for branching is fairly large, but it seems that the batch sizes, at least, were almost a convenient by-product of a path ATI had already set upon with their first Shader Model 2.0 part. ATI were the first to increase pixel pipeline parallelism, with R300 including 2 quad engines, later increased to 4 with R420, but rather than tying these quad engines together they set them up to operate on different tiles of the screen, and hence on completely different and unrelated programs. While this processing system does have its trade-offs, such as a reduction in texture cache coherency, it did allow ATI to easily scale the pipelines without re-architecting everything, giving rise to elements such as SuperTiling for multi-graphics processing; this is also the root of ATI's small batch size support.

Going on to the next point, architectural efficiency, we can again take this many ways. In terms of power draw and heat output it's certainly no more efficient than its predecessors, although the X1800 XT, with the highest power draw, does also house an extra 256MB of high speed RAM over any other board. It's certainly not more efficient in terms of transistor usage either, seeing that it uses twice the transistors of its predecessors and, when measured against competitive parts at similar clock rates, doesn't seem more efficient.

Where R520 clearly does achieve high efficiency is in its performance relative to its predecessors. Comparing the similarly specified X1800 XL to the X800 XT we can see performance gains of over 40% in some gaming applications, and that's with exactly the same memory bandwidth and the same theoretical pixel processing rates. One element that isn't clear from this, though, is exactly how much improvement the batch dispatcher is bringing outside of the branching capabilities, as pure pixel shader performance does not change that much - without FSAA the differences between the boards are slimmer, although the tests do seem to indicate that texture efficiency has improved. With this in mind, and given that R520 doesn't increase the number of math processors in the fragment pipelines, it does appear that R520 is leaning on its clock speed advances, at least in the form of the XT, to significantly increase pixel shader throughput; perhaps this is one area that could have benefited from more focus.

Many of the performance-enhancing efficiency improvements appear to stem from the new memory controller, although given that, judging from the die shot, this appears to be the single largest consumer of transistors on R520, you would expect them to be getting some benefit from it. The controller, and other elements linked to it, have certainly improved performance where FSAA is concerned, as there are stark improvements on R520 even when the available bandwidth doesn't change. This memory controller is designed to last for some time, with capabilities for GDDR4 support already present, and so for this first incarnation in R520 it does seem a little over-architected for its current use.

Finally, we come to the image quality improvements. First, ATI have enabled Adaptive FSAA, bringing them closer in line with NVIDIA's similar Transparency AA offering, but coupled with ATI's flexible FSAA mechanism. ATI have also added the new angle-invariant Anisotropic Filtering, which will provide a more balanced and clearer image when off-angle textures are used in a scene. And while ATI were some way behind NVIDIA in providing floating point blending, now that they have finally provided it they have done so with MSAA support as well, allowing users more flexibility.

Although R520 has suffered from a lengthy and troublesome gestation period, even at this point it represents an interesting choice. While performance and price wise it may not offer a clear advantage over competitive choices, and nobody will be disappointed with either high end offering, X1800 does offer some extra options that the end user can benefit from immediately; whether ATI's focus on dynamic branching really takes off depends on how developers go with Shader Model 3.0. The final remaining question is that of availability - while X1800 XLs are already on sale, we're still waiting on the final versions of X1800 XT.

 


 


Other related articles: