ATI RV740 GPU and Architecture analysis

Saturday 26th September 2009, 01:36:00 PM, written by Rys

After too long a hiatus, we're back with a new GPU and architecture analysis! ATI's 40nm marvel, RV740, is under Alex's microscope in Radeon HD 4770 form. Bringing almost everything that made RV770 (and now RV870) great to the lower end of the market, does RV740 ultimately impress?

Tagging

ati, radeon, rv740, hd, 4770, 128, bit


Latest Thread Comments (13 total)
Posted by rpg.314 on Saturday, 26-Sep-09 14:53:56 UTC
From here

http://www.beyond3d.com/content/reviews/52/9

Quoting B3D news
In a single clock, any instruction can read from at most 3 distinct GPR addresses due to read port restrictions.
In the code posted here,

http://forum.beyond3d.com/showpost.php?p=1322632&postcount=1

In instruction 35, I see 5 registers being read. Are there differences between RV740's and RV770's shaders?

Posted by LordEC911 on Saturday, 26-Sep-09 19:23:08 UTC
Dang, another review to read...
Haven't even got through all the 58x0 reviews yet.

Posted by AlStrong on Saturday, 26-Sep-09 19:27:30 UTC
Are you complaining about having to read a *Beyond3D* article? :shocked: :wink3: :runaway: :runaway: :runaway:

Although if *recent* history serves *correctly* *cough*, you'll have plenty of time to catch up until the next one. :wink4:

Posted by wishiknew on Saturday, 26-Sep-09 21:59:57 UTC
I haven't checked the front page in a year and I hit the jackpot!

Posted by mczak on Sunday, 27-Sep-09 02:02:04 UTC
Quoting rpg.314
In instruction 35, I see 5 registers being read. Are there differences between RV740's and RV770's shaders?
Not that I know of. But I think you're interpreting that wrong: the docs say storage is separated per element. Hence you can access 3 different .x components, 3 different .y components, etc. (with some limitations), and that instruction satisfies this (x 2/3/13, y 3/13, z 3/16, w 0/3/16).
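The per-component reading of the rule can be checked against the numbers given for instruction 35. The snippet below is only a minimal sketch: it assumes the limit is exactly 3 distinct GPR addresses per component, ignores the "some limitations" mentioned above, and uses hypothetical helper names (`reads_per_component`, `fits_read_ports`) that come from neither the article nor AMD's documentation.

```python
from collections import defaultdict

# Assumption: the "3 distinct GPR addresses" limit applies per component
# (x, y, z, w), as described in the post above. Other hardware constraints
# are deliberately not modelled.
MAX_GPR_READS_PER_COMPONENT = 3


def reads_per_component(operands):
    """Group (gpr_index, component) reads into distinct GPR indices per component."""
    by_component = defaultdict(set)
    for gpr, component in operands:
        by_component[component].add(gpr)
    return dict(by_component)


def fits_read_ports(operands, limit=MAX_GPR_READS_PER_COMPONENT):
    """Return True if no component reads more than `limit` distinct GPRs."""
    return all(len(gprs) <= limit
               for gprs in reads_per_component(operands).values())


# Reads listed for instruction 35: x 2/3/13, y 3/13, z 3/16, w 0/3/16.
instruction_35 = [
    (2, "x"), (3, "x"), (13, "x"),
    (3, "y"), (13, "y"),
    (3, "z"), (16, "z"),
    (0, "w"), (3, "w"), (16, "w"),
]

distinct_gprs = {gpr for gpr, _ in instruction_35}
print(sorted(distinct_gprs))            # the five registers rpg.314 counted
print(fits_read_ports(instruction_35))  # per-component check
```

Counted as whole registers, the instruction touches five GPRs (0, 2, 3, 13, 16), which is what prompted the question. Counted per component, no component reads more than three, so under this simplified model the instruction fits.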

Posted by trinibwoy on Sunday, 27-Sep-09 21:42:11 UTC
Nice job guys, keep em coming.

Posted by Jawed on Sunday, 27-Sep-09 23:28:44 UTC
Errors on page 9
Quote
Integer and float instructions can't be processed in parallel.
Both types can share an instruction group.

Quote
The transcendental unit (the Rys unit!) is different from its more silhouette conscious brethren: it's (surprisingly!!!) capable of handling transcendentals (cos, sin, log, exp, rcp et al.) at a rate of 1/cycle, INT MUL, due to a slightly higher internal precision (40-bit versus 32-bit, allowing expression of a 32-bit int in the FP exponent) than the other ALUs, and format conversions, all whilst not being able to process dot products or double precision work (*so it's idle when double precision processing is happening*).
The T unit is fully functional while XY or ZW or XYZW unit-combinations are executing double-precision instructions. T is only subject to operand bandwidth/register-file porting.

Quote
Work-assignment for the ALUs is also asymmetrical, the slim ALUs being tasked first, with the T ALU being the last to be issued an instruction, except when the instruction group contains transcendental or other instructions only it can execute, in which case it is immediately issued the corresponding instruction.
Order of work-assignment (XYZW versus T) is actually an option in the hardware: CONFIG.ALU_INST_PREFER_VECTOR.

Quote
The transcendental ALU shares GPR read ports with the other ALUs, so it can either load a needed operand in a single cycle if and only if one of the slim ALUs loads the same operand, otherwise it has to wait until such a time when an unused read port is available.
T can also use any of the in-pipe registers (PV and PS) or any type of constant. The operand fetching algorithm and resulting constraints are byzantine and not worth the effort of comprehending (unless you're writing a compiler or are after the last 1% of performance).

Quote
Writes are owner-exclusive (only the owner thread can write to the owned location), reads are shared (all other threads can read any location).
"Owner-exclusive" is merely an option. Though I've seen one document that suggests otherwise, it's battling both the R700 Family ISA and the Intermediate Language Specification (which, admittedly, could be considered "forward looking", beyond R700).

Jawed

Posted by Freak'n Big Panda on Monday, 28-Sep-09 14:56:19 UTC
Holy crap. Amazing. Keep it up Rys, I'm hoping for an 870 review.

Posted by Richard on Monday, 28-Sep-09 16:16:47 UTC
This is Alex's fault. :eek:

Posted by mboeller on Thursday, 08-Oct-09 10:32:19 UTC
Quoting AlStrong
Are you complaining about having to read a *Beyond3D* article?

Although if *recent* history serves *correctly* *cough*, you'll have plenty of time to catch up until the next one. :wink4:

Not this time:

http://www.beyond3d.com/content/reviews/53/1

Two for the price of one ;)



Related ati News

ATI 5830 launched, baffled looks follow
ATI Cypress Gaming Performance Analysis
ATI Catalyst 10.1 Display Driver
ATI Radeon HD 5670 released, bringing DX11 for less than $100
ATI 5970 comes out to play, completes ATI's lineup
AMD OpenCL development platform for CPU and GPU
ATI Cypress GPU and architecture analysis
ATI Radeon HD 5870 released, powered by new DX11 GPU
ATI Radeon HD 4890 launched at $250 with improved GPU
ATI Mobility Radeon HD 4860 and 4830 appear on 40nm