Introduction


NVIDIA has just announced the GeForce 256, otherwise known as the NV10. We're very excited about it, not only because it delivers very high fill rates, but also because it is one of the first chips to include T&L (transforming and lighting), along with S3's just-announced Savage2000. While the GeForce and the Savage2000 unquestionably look very cool, a lot of questions have come to mind, as well as some possible answers to those questions. NVIDIA and S3 seem to have left a lot of blank spaces, and we want those spaces filled. As many of you know, Beyond 3D tries to give not only the most accurate but also the most complete information. Because of this, we feel some issues should be addressed, and that is what we'll attempt to do. While the issues of T&L apply to both NVIDIA and S3, the other aspects we'll discuss apply specifically to the GeForce 256.

We were able to take an early look at the technical briefs that NVIDIA released the other day on Transformations and Lighting, Cube Environment Mapping, and AGP 4x with Fast Writes. As always, these papers try to make things sound great, so there is a need for some questions and critical comments. Before we start, I want to stress that it's not our goal to make the new product sound bad in any way. We just think we need to realize that these papers are part of the marketing machine and are thus subject to exaggeration and too much optimism. More information also always leads to more questions. On top of that, I'd like to know how things work, and the best way to find out is by questioning the content of the papers: Why do they do this? How would it work? What happens if we do this? So feel free to add your own input: questions, comments and of course answers are welcome by email or, even better, in the forum. The forum is here and the registration for the forum here.

Transformations and Lighting


Transformations and Lighting, or T&L, is a technology with two sides, and all too often only the positive side is shown. The positive side is the huge possible increase in detail, but that increase comes at a cost and can have a negative impact on the efficiency and performance of the 3D rendering core.

The basic idea of T&L is to accelerate the transformations between the various "spaces" and the lighting calculations. To give a brief idea of what these different "spaces" are, here is a summary of each. The first one is known as world space. This is basically where the 3D objects are held for the environment. The eye space is used for lighting and culling. Finally, the screen space is where the information can be held within the board's frame-buffer.

Basically, all these transforms are a lot of mathematics, mathematics that has to be carried out on all the vertices of the triangles that form a 3D object. These transformations are translations (moving an object around in space while keeping it parallel to its original position), rotations (turning an object around in space, so that it is no longer parallel) and scaling (making the object larger or smaller). These operations are needed to represent objects correctly in a 3D world.

For example, assume we have a racing game, and in this game we have some scenery, let's say a house. Somewhere we have a unified description of how this house looks. So basically, we have a huge list of positions of triangles, and all these triangles combined form the 3-dimensional house. Now we need to place that unified house in the 3D world (equal to a landscape with a road and some cars). It's pretty obvious that we need to translate and rotate the house so that it sits in the right position in the landscape. We also need to scale it so that the house looks bigger than a car, yet smaller than the skyscraper further down the road. So these 3 operations are the basics needed to make an object fit into the 3D world, and they have to be carried out for every vertex of every triangle of every object in the scene.

Since the same mathematics is needed each time, it's very easy to predict how many transformations you can handle each clock cycle. Basically, you can say that x transforms are done per clock. This x might be larger or smaller than one, depending on how good the hardware transform engine is.
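To make the house example a bit more concrete, here is a minimal sketch in Python of the three basic operations applied to a few vertices. The function name, the sample vertices, the scale factor, the angle and the offset are all made up for illustration; this is not how the GeForce 256 or any other chip implements its transform engine, just the kind of per-vertex arithmetic such an engine has to carry out.

```python
import math

def model_to_world(vertex, scale, angle_deg, offset):
    """Scale, rotate about the vertical (y) axis, then translate one vertex.

    All parameters are illustrative assumptions, not values from any real game.
    """
    x, y, z = vertex

    # Scaling: make the object larger or smaller.
    x, y, z = x * scale, y * scale, z * scale

    # Rotation: turn the object around the y axis.
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    x, z = c * x + s * z, -s * x + c * z

    # Translation: move the object to its place in the world.
    tx, ty, tz = offset
    return (x + tx, y + ty, z + tz)

# One triangle of the hypothetical "unified" house description.
house = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]

# The same arithmetic is repeated for every vertex of every triangle.
world = [
    model_to_world(v, scale=10.0, angle_deg=90.0, offset=(50.0, 0.0, 20.0))
    for v in house
]
```

Every vertex goes through exactly the same sequence of multiplies and adds, which is why the cost per vertex is fixed and the throughput of a hardware transform engine can be stated as a number of transforms per clock.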