Technical explanation of the problem

These uniformly colored zones are no big surprise. These artifacts are a result of the screen-capture technique used by Quake III Arena. I’ll explain step-by-step what happens when Quake III Arena takes a screenshot on the VSA-100 hardware.

As you probably know, the VSA-100 T-Buffer technology uses 4 sub-buffers which contain the sub-samples that are combined to form the final anti-aliased image. If you don’t know this, you should read the T-Buffer and Full-Scene Anti-aliasing white papers first.

Now the whole problem follows from this buffer combining. Each buffer contains 16-bit pixel values. 16-bit color means 5-6-5: 5 bits each for red and blue, and 6 bits for green (because the human eye is more sensitive to shades of green). So what we have is four 16-bit values (a color is just a number to a computer) that are combined using the following formula:

Final Color = (Buffer 1 + Buffer 2 + Buffer 3 + Buffer 4) / 4
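To make the formula concrete, here is a minimal Python sketch of the averaging step. It assumes the average is taken per color channel on unpacked 5-6-5 values, since averaging the packed 16-bit number directly would let bits spill from one channel into the next. The function names `unpack_565` and `average_pixels` are made up for this illustration and do not describe the actual hardware.

```python
def unpack_565(pixel):
    """Split a packed 16-bit 5-6-5 pixel into (red, green, blue)."""
    red = (pixel >> 11) & 0x1F    # top 5 bits
    green = (pixel >> 5) & 0x3F   # middle 6 bits
    blue = pixel & 0x1F           # bottom 5 bits
    return red, green, blue


def average_pixels(buffers):
    """Average four packed 5-6-5 pixels channel by channel.

    The result is kept as floats so the fractional part stays visible.
    """
    channels = [unpack_565(p) for p in buffers]
    return tuple(sum(c[i] for c in channels) / 4 for i in range(3))


# Three white sub-samples and one black one (hypothetical edge pixel).
print(average_pixels([0xFFFF, 0xFFFF, 0xFFFF, 0x0000]))
```

The averaged channels come out with a fractional part, which is exactly the information that a plain 5-6-5 pixel has no room to store.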

This formula defines a simple averaging filter. Now if you take a whole number and divide it by four, there are only 4 possible fractional parts: 0.00, 0.25, 0.50 and 0.75. For example:

32 / 4 = 8.00   Fractional part is 0.00
33 / 4 = 8.25   Fractional part is 0.25
34 / 4 = 8.50   Fractional part is 0.50
35 / 4 = 8.75   Fractional part is 0.75

These are the only possibilities that exist, and these fractional parts show the gain in bits. Let me explain. When you divide 32 by 4 using integer accuracy you get 8. When you divide 33 by 4 you also get 8. When you divide 34 by 4 you get 9 if we assume rounding (fractional parts of 0.50 or more are rounded up to the next integer), but 8 if we assume truncation (the fractional part is simply cut off). The same is true for 35: 9 with rounding, 8 with truncation. Either way, 4 input numbers end up on the same integer result. With truncation, 32, 33, 34 and 35 all give 8. With rounding the same happens for a different group of inputs (30, 31, 32 and 33). So basically we “lose” accuracy: there are 4 base numbers but only 1 possible integer output!

But exactly how much accuracy do we gain when we combine 4 numbers and keep the full average? We gain the 4 possible fractional cases: 0.00, 0.25, 0.50 and 0.75. Four extra cases translate to 2 extra bits of accuracy, since encoding 4 cases takes exactly 2 bits: 00, 01, 10 and 11. So this combine operation delivers 2 extra bits. The gain is per color channel, since every color channel is a number of its own. As I said before, 16-bit color means 5-6-5, but after the combine-and-average operation we get 7-8-7: 2 extra bits for every color channel. And 7-8-7 adds up to 22-bit color accuracy!
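The arithmetic above can be checked with a few lines of Python. This is only a sketch of the number game, not of the chip itself. The helper names are invented here, and `total` stands for the sum of the four sub-samples of one channel before the division by 4.

```python
def truncate_div4(total):
    """Divide by 4 and cut off the fractional part."""
    return total >> 2


def round_div4(total):
    """Divide by 4, rounding fractions of 0.50 and above up."""
    return (total + 2) >> 2


def fraction_bits(total):
    """The 2 low bits that an integer result throws away (0..3 = .00 .. .75)."""
    return total & 0b11


for total in (32, 33, 34, 35):
    print(total, truncate_div4(total), round_div4(total),
          format(fraction_bits(total), "02b"))

# Four 5-bit samples sum to at most 4 * 31 = 124, which fits in 7 bits,
# so the un-divided sum is itself the "5 + 2 bit" value of the channel.
print((4 * 31).bit_length())
```

Seen this way, the 7-8-7 value is nothing more than the sum of the four samples, and dividing by 4 with integer accuracy is what removes the 2 extra bits again.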

Now how does this influence the screen capturing?

What happens when you do a screen capture is that the V5500 hardware reads the data from all 4 T-Buffers and combines them automatically. Essentially the result for each pixel is a 22-bit number at the hardware level. The RAMDAC has no problem displaying a 22-bit color number, since it’s designed to handle up to 24-bit color (32-bit mode actually, but the 8 alpha bits are not used for display; they are used for internal rendering math). So when the image is displayed normally you see the full-quality 22-bit image, but when capturing this doesn’t work. While the hardware has 22-bit accuracy internally, the capture software expects a 16-bit output image (remember we are in 16-bit color mode, so why would the software expect 22-bit output?).

Thus, what happens is that the color values get truncated: the extra bits are thrown away by the hardware/software. You just gained 2 bits per color channel, but you lose them again because the software doesn’t expect them, and what you lose is the extra fine detail. The result of truncation, as explained before, is that several base numbers get projected onto the same output number. In terms of colors this means that different colors at the input become the same color at the output. So uniform blobs of color appear and the small detail is LOST. And this is EXACTLY what we see in the Quake III Arena screenshots published all over the web and used for image quality comparisons. Is this a correct comparison? Of course not. The screenshot does not match the true on-screen image quality, so the comparison cannot be valid!

In 32-bit mode the same problem remains. Theoretically the on-chip accuracy will be higher than 8-8-8, more precisely 10-10-10, but neither RAMDACs nor file formats support this extra color depth at this moment. So the visual difference in 24/32-bit color mode is minimal, at least when comparing with other products also running in 32-bit mode. The accuracy lost in 32-bit mode also matters much less than the accuracy lost in 16-bit mode (you still lose the finest detail, but it’s so little detail that you don’t notice it).

How do we solve this?

Well, the problem is that the internal and on-screen accuracy is 22 bits. We therefore need a screen capture program that knows this and expects 22-bit values from the hardware instead of the 16-bit values that normal programs expect right now. The best solution would be a screen capture program that just takes all the data and does no truncation at all, effectively outputting a 22-bit or 24-bit image while the hardware runs in 16-bit mode. In practice the output would be 24-bit, since 22-bit file formats are not really used.

Another solution would be to take the 22-bit numbers and reduce them to 16-bit using dithering. With dithering you can approach the image quality of the 22-bit original, although there would still be some loss, since a dithered 16-bit image is not equal to true 22 bits of accuracy. Dithering trades pixel accuracy for color accuracy, resulting in a loss of image sharpness.
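Both solutions can be sketched in Python for a single color channel. This is an illustration under assumptions, not an existing capture tool: `total` is again the sum of the four sub-samples, `expand_to_8bit` and `dither_channel` are made-up names, and the 2x2 ordered-dither pattern is just one simple choice among many possible dithering schemes.

```python
def expand_to_8bit(total, max_sample):
    """Scale a summed channel (0 .. 4 * max_sample) to 0 .. 255 with rounding.

    max_sample is 31 for the 5-bit channels and 63 for green.
    """
    full_scale = 4 * max_sample
    return (total * 255 + full_scale // 2) // full_scale


# 2x2 ordered-dither offsets, chosen per pixel position (assumed pattern).
DITHER_2X2 = ((0, 2),
              (3, 1))


def dither_channel(total, x, y):
    """Reduce a summed channel to its original bit depth with ordered dither."""
    return (total + DITHER_2X2[y & 1][x & 1]) >> 2


# Solution 1: every internal red level keeps its own 8-bit output value.
levels = {expand_to_8bit(t, 31) for t in range(4 * 31 + 1)}
print(len(levels))

# Solution 2: a flat area of value 33 (8.25 after averaging) becomes a
# 2x2 pattern of 8s and 9s whose average is still 8.25.
block = [dither_channel(33, x, y) for y in range(2) for x in range(2)]
print(block, sum(block) / 4)
```

The first function loses nothing, because the 8-bit output has more levels than the 7-bit or 8-bit sums need. The second one keeps the average color of an area correct but spreads it over neighbouring pixels, which is the loss of sharpness mentioned above.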