Old 04-Apr-2012, 01:09   #76
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

I don't think it is SRAM.
__________________
The views presented here are my own and not my employer's.
Quote:
Originally Posted by Alexko View Post
So in a nutshell, model [BLANK] will have [BLANK], up to [BLANK], and even [BLANK] for a power consumption of just [BLANK]. Impressive.
rpg.314 is offline   Reply With Quote
Old 04-Apr-2012, 06:35   #77
Ninjaprime
Member
 
Join Date: Jun 2008
Posts: 335
Default

Quote:
Originally Posted by rpg.314 View Post
I don't think it is SRAM.
Really? Seems tiny if it's DRAM on an interposer. I wouldn't think they would need an interposer at all if it was only 64MB of DRAM. I mean, in 2013 they should have 4Gbit DRAM chips, and a 2Gbit chip has a raw die size of ~55mm^2 nowadays, IIRC? That's gotta be below 10mm^2 for 64MB of DRAM, so what's the interposer for?

Unless of course this info is completely wrong and it's much more than 64MB... I was expecting 512MB or 1GB stacked next to the die, personally.
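
A rough sanity check of the die-area estimate above, as a sketch only. It assumes area scales linearly with capacity (ignoring periphery and I/O overhead) and that a 4Gbit-generation chip doubles the density of the ~55mm^2 2Gbit die; `dram_area_mm2` and `density_gain` are hypothetical names, not vendor data.

```python
def dram_area_mm2(capacity_mbyte: float,
                  ref_die_mm2: float = 55.0,   # ~55mm^2 for a 2Gbit die (figure from the post)
                  ref_die_gbit: float = 2.0,
                  density_gain: float = 1.0) -> float:
    """Estimate die area by scaling a reference die linearly with capacity.

    density_gain is an assumed improvement in bits per mm^2 over the
    reference die (1.0 = same density, 2.0 = twice as dense).
    """
    capacity_gbit = capacity_mbyte * 8 / 1024           # MByte -> Gbit
    mm2_per_gbit = ref_die_mm2 / ref_die_gbit / density_gain
    return capacity_gbit * mm2_per_gbit


print(dram_area_mm2(64))                    # 13.75 (at 2Gbit-generation density)
print(dram_area_mm2(64, density_gain=2.0))  # 6.875 (if 4Gbit chips double the density)
```

Under these assumptions the "below 10mm^2" figure only holds at the denser 4Gbit generation; at today's 2Gbit density the same 64MB would be closer to 14mm^2. Either way it is a small die.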
Ninjaprime is offline   Reply With Quote
Old 04-Apr-2012, 09:39   #78
hoho
Senior Member
 
Join Date: Aug 2007
Location: Estonia
Posts: 1,218
Send a message via MSN to hoho Send a message via Skype™ to hoho
Default

Maybe it was 512MBytes, but the article's authors thought it was 512Mbits and converted it to 64MBytes.
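
The suspected mix-up is a plain factor-of-8 unit conversion. A minimal illustration (the function name is made up for this sketch):

```python
BITS_PER_BYTE = 8


def mbit_to_mbyte(mbit: float) -> float:
    """Convert megabits to megabytes."""
    return mbit / BITS_PER_BYTE


# Reading "512M" as megabits instead of megabytes shrinks it eightfold.
print(mbit_to_mbyte(512))  # 64.0
```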
hoho is offline   Reply With Quote
Old 04-Apr-2012, 11:16   #79
Paran
Member
 
Join Date: Sep 2011
Posts: 109
Default

Quote:
Originally Posted by hoho View Post
Maybe it was 512MBytes, but the article's authors thought it was 512Mbits and converted it to 64MBytes.

That would be such an epic fail, typical Charlie.
Paran is offline   Reply With Quote
Old 04-Apr-2012, 23:30   #80
Albuquerque
Red-headed step child
 
Join Date: Jun 2004
Location: Guess ;)
Posts: 3,084
Default

And yet, would be epic badassery if there's 512MBytes of ram stacked on the die at full clock and a fat connection. I'm not counting on that, though...
__________________
"...twisting my words"
Quote:
Originally Posted by _xxx_ 1/25 View Post
Get some supplies <...> Within the next couple of months, you'll need it.
Quote:
Originally Posted by _xxx_ 6/9 View Post
And riots are about to begin too.
Quote:
Originally Posted by _xxx_8/5 View Post
food shortages and huge price jumps I predicted recently are becoming very real now.
Quote:
Originally Posted by _xxx_ View Post
If it turns out I was wrong, I'll admit being stupid
Albuquerque is offline   Reply With Quote
Old 05-Apr-2012, 01:05   #81
Ninjaprime
Member
 
Join Date: Jun 2008
Posts: 335
Default

Quote:
Originally Posted by Albuquerque View Post
And yet, would be epic badassery if there's 512MBytes of ram stacked on the die at full clock and a fat connection. I'm not counting on that, though...
That would be awesome. Though as I think about it, I'm starting to see the appeal of using SRAM even if it was only 64MB. Would essentially be a huge L4 cache that could be shared with the CPU/GPU, and could be used as a framebuffer the way Xbox360 uses its eDRAM. The low latency and huge bandwidth could make for some interesting efficiency gains in the pipeline...
Ninjaprime is offline   Reply With Quote
Old 05-Apr-2012, 01:13   #82
Alexko
Senior Member
 
Join Date: Aug 2009
Posts: 2,024
Send a message via MSN to Alexko
Default

Bear in mind that we're talking about a mainstream, notebook-oriented part, so cost and power are important concerns.
__________________
"Well, you mentioned Disneyland, I thought of this porn site, and then bam! A blue Hulk." —The Creature
My (currently dormant) blog: Teχlog
Alexko is offline   Reply With Quote
Old 05-Apr-2012, 02:28   #83
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Quote:
Originally Posted by Ninjaprime View Post
Really? Seems tiny if it's DRAM on an interposer. I wouldn't think they would need an interposer at all if it was only 64MB of DRAM. I mean, in 2013 they should have 4Gbit DRAM chips, and a 2Gbit chip has a raw die size of ~55mm^2 nowadays, IIRC? That's gotta be below 10mm^2 for 64MB of DRAM, so what's the interposer for?

Unless of course this info is completely wrong and it's much more than 64MB... I was expecting 512MB or 1GB stacked next to the die, personally.
I was expecting a much larger amount as well. But SRAM seems too costly for that, even if made on an older process. DRAM or even eDRAM looks more likely than SRAM, IMO.

They seem to have traded off capacity for much larger bandwidth. I hope they are able to hit ~100GB/s if they are going with only 64MB. The comment about on-die framebuffers to feed the display during low-power states is also interesting. They would need ~8MB just for that. I am not sure where they will find room for that, given that 32nm dual-cores have 4MB of L3 in total.
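
For reference, the ~8MB figure matches a single 1080p surface at 32 bits per pixel. The resolution and pixel format are assumptions here; the post does not state them.

```python
def framebuffer_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed framebuffer in bytes."""
    return width * height * bytes_per_pixel


fb = framebuffer_bytes(1920, 1080)  # assumed: 1080p, 32-bit colour
print(fb)           # 8294400 bytes
print(fb / 2**20)   # 7.91015625 MiB
```

Double buffering or a higher-resolution panel would raise the requirement proportionally.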
rpg.314 is offline   Reply With Quote
Old 13-May-2012, 21:45   #84
fellix
Senior Member
 
Join Date: Dec 2004
Location: Varna, Bulgaria
Posts: 2,819
Send a message via Skype™ to fellix
Default



Haswell is most likely to have 50% more ALUs, according to a die shot of an early running sample.
__________________
Apple: China -- Brutal leadership done right.
Google: United States -- Somewhat democratic.
Microsoft: Russia -- Big and bloated.
Linux: EU -- Diverse and broke.
fellix is offline   Reply With Quote
Old 13-May-2012, 22:50   #85
Paran
Member
 
Join Date: Sep 2011
Posts: 109
Default

Who says that this is a Haswell die?

According to Hiroshige Goto, Haswell GT2 has 80 ALUs while GT3 has 160.

http://pc.watch.impress.co.jp/img/pc...tml/8.jpg.html
Paran is offline   Reply With Quote
Old 14-May-2012, 01:47   #86
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Quote:
Originally Posted by Paran View Post
Who says that this is a Haswell die?
Nobody.
rpg.314 is offline   Reply With Quote
Old 20-May-2012, 14:55   #87
AnarchX
Senior Member
 
Join Date: Apr 2007
Posts: 1,394
Default

Kaveri with GDDR5?
http://www.donanimhaber.com/islemci/...re-gidiyor.htm
AnarchX is offline   Reply With Quote
Old 21-May-2012, 01:52   #88
liolio
French frog
 
Join Date: Jun 2005
Location: France
Posts: 4,172
Default

Quote:
Originally Posted by AnarchX View Post
Sorry, but I don't understand that language. What did they say?

AMD bought a memory company not that long ago; they could produce GDDR5 modules.

I don't think all the memory would be GDDR5, but a module could be GDDR5.
Basically, Kaveri could support two 128-bit buses, one for GDDR5 and one for DDR3.

Last edited by liolio; 22-May-2012 at 08:18.
liolio is offline   Reply With Quote
Old 21-May-2012, 05:31   #89
Raqia
Member
 
Join Date: Oct 2003
Posts: 320
Default

Quote:
Originally Posted by liolio View Post
Sorry, but I don't understand that language. What did they say?

AMD bought a memory company not that long ago; they could produce GDDR5 modules.

I don't think all the memory would be GDDR5, but a module could be GDDR5.
Basically, Kaveri could support two 128-bit buses, one for GDDR5 and one for DDR3.
Two buses wouldn't make sense for a consumer-level part like Kaveri; it'd be very expensive, and most consumers don't need the flexibility of being able to upgrade.

They should just make Kaveri boards w/ GDDR5 soldered on (much like a graphics board), and they'd get much better memory bandwidth by using a fat bus and a memory controller similar to what they use in GPUs instead of a bottlenecked DIMM interface. I can see lower-end parts having 8 GB and higher-end parts 16 GB w/ 100+ GB/s, compared to ~30 GB/s today.
Raqia is offline   Reply With Quote
Old 22-May-2012, 08:25   #90
liolio
French frog
 
Join Date: Jun 2005
Location: France
Posts: 4,172
Default

Well, the density of GDDR5 memory chips is not the same as for DDR3. Is more than 4GB doable? With which bus width?
Wouldn't the extra latency hurt CPU performance?

Two buses would fit easily in Llano/Trinity. Even a 64-bit bus would help, giving the chip an extra ~30GB/s of bandwidth.
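
The bandwidth figures in this exchange follow from bus width times per-pin data rate. A sketch, with the data rates as assumptions: 1.866 GT/s stands in for dual-channel DDR3-1866, and 4.0 GT/s is a conservative guess for GDDR5; neither number comes from the thread.

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Theoretical peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gtps


ddr3_128 = peak_bandwidth_gbs(128, 1.866)   # assumed dual-channel DDR3-1866
gddr5_64 = peak_bandwidth_gbs(64, 4.0)      # assumed 64-bit GDDR5 at 4 GT/s
gddr5_128 = peak_bandwidth_gbs(128, 4.0)    # assumed 128-bit GDDR5 at 4 GT/s

print(round(ddr3_128, 1))   # roughly the "~30 GB/s today" figure
print(gddr5_64)             # 32.0, in line with the "extra ~30GB/s" estimate
print(gddr5_128)            # 64.0
```

At these assumed rates, reaching 100+ GB/s would take either a bus wider than 128 bits or faster GDDR5.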
liolio is offline   Reply With Quote
Old 02-Sep-2012, 10:18   #91
AnarchX
Senior Member
 
Join Date: Apr 2007
Posts: 1,394
Default

GenX (7.5) GT4 @14nm (Broadwell) with up to 2 TFLOPS?
http://www.inpai.com.cn/doc/hard/180995.htm

HSW GT3 could be in the 1 TFLOPS range.
AnarchX is offline   Reply With Quote
Old 03-Sep-2012, 01:17   #92
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Sounds fake, if you ask me.

The 7970 is <4 TFLOPS today, and it eats >200W.

EDIT: surreal ->fake
rpg.314 is offline   Reply With Quote
Old 03-Sep-2012, 03:36   #93
Alexko
Senior Member
 
Join Date: Aug 2009
Posts: 2,024
Send a message via MSN to Alexko
Default

Quote:
Originally Posted by rpg.314 View Post
Sounds fake, if you ask me.

The 7970 is <4 TFLOPS today, and it eats >200W.

EDIT: surreal ->fake
The 7970M (Pitcairn) gets 2,176 GFLOPS in 100W, on 28nm, or 21.76 GFLOPS/W.

Ideal scaling would take that to 43.52 GFLOPS/W on 20nm, and 87.04 on 14nm.

So you'd only need <12W to get to 1TFLOPS on 14nm. Of course, ideal scaling is a pipe dream these days, but with a power budget of ~75W for a desktop chip and full, aggressive bi-directional power management, it seems doable.

They'd need to do something big about memory bandwidth, though.
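
The perf/W arithmetic in this post can be reproduced in a few lines. This is only a sketch of the "ideal scaling" premise (efficiency doubling with each full node shrink); `gain_per_shrink` is a hypothetical knob for trying less optimistic values.

```python
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Efficiency of a part from its peak throughput and power budget."""
    return gflops / watts


def scaled_efficiency(base: float, shrinks: int, gain_per_shrink: float = 2.0) -> float:
    """Efficiency after a number of node shrinks, each multiplying perf/W by gain_per_shrink."""
    return base * gain_per_shrink ** shrinks


def watts_for(target_gflops: float, efficiency: float) -> float:
    """Power needed to reach a throughput target at a given efficiency."""
    return target_gflops / efficiency


base = gflops_per_watt(2176, 100)     # 7970M figures quoted in the post
at_14nm = scaled_efficiency(base, 2)  # 28nm -> 20nm -> 14nm
print(round(base, 2), round(at_14nm, 2), round(watts_for(1000, at_14nm), 1))
# prints: 21.76 87.04 11.5
```

Lowering `gain_per_shrink` below 2.0 shows how quickly the required power grows once scaling is less than ideal.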
Alexko is offline   Reply With Quote
Old 03-Sep-2012, 04:15   #94
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Quote:
Originally Posted by Alexko View Post
The 7970M (Pitcairn) gets 2,176 GFLOPS in 100W, on 28nm, or 21.76 GFLOPS/W.

Ideal scaling would take that to 43.52 GFLOPS/W on 20nm, and 87.04 on 14nm.

So you'd only need <12W to get to 1TFLOPS on 14nm. Of course, ideal scaling is a pipe dream these days, but with a power budget of ~75W for a desktop chip and full, aggressive bi-directional power management, it seems doable.

They'd need to do something big about memory bandwidth, though.
I thought the high GT parts were for mobile.
rpg.314 is offline   Reply With Quote
Old 03-Sep-2012, 06:24   #95
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Quote:
Originally Posted by Alexko View Post
The 7970M (Pitcairn) gets 2,176 GFLOPS in 100W, on 28nm, or 21.76 GFLOPS/W.

Ideal scaling would take that to 43.52 GFLOPS/W on 20nm, and 87.04 on 14nm.

So you'd only need <12W to get to 1TFLOPS on 14nm. Of course, ideal scaling is a pipe dream these days, but with a power budget of ~75W for a desktop chip and full, aggressive bi-directional power management, it seems doable.

They'd need to do something big about memory bandwidth, though.
The interposer tech is supposed to be on the way with Haswell. Let's hope they can provide lots of bandwidth by then.

GPU performance of that order could eliminate the need for discrete GPUs altogether, even for high-DPI displays.
rpg.314 is offline   Reply With Quote
Old 03-Sep-2012, 15:07   #96
homerdog
hardly a Senior Member
 
Join Date: Jul 2008
Location: still camping with a mauler
Posts: 3,637
Default

Quote:
Originally Posted by Alexko View Post
The 7970M (Pitcairn) gets 2,176 GFLOPS in 100W, on 28nm, or 21.76 GFLOPS/W.

Ideal scaling would take that to 43.52 GFLOPS/W on 20nm, and 87.04 on 14nm.

So you'd only need <12W to get to 1TFLOPS on 14nm. Of course, ideal scaling is a pipe dream these days, but with a power budget of ~75W for a desktop chip and full, aggressive bi-directional power management, it seems doable.

They'd need to do something big about memory bandwidth, though.
Remember you're comparing 28nm TSMC to 22nm Intel. In that situation I wouldn't be surprised if Intel could have ideal scaling compared to the TSMC manufactured stuff.

Memory bandwidth is always the larger problem with iGPUs, and it isn't clear how (or if) there is a plan to solve that. Although I hear there may be some crazy eDRAM-type stuff coming down the pipe. The question then is how to make use of it while staying within the D3D API, or whether each iGPU architecture will require custom code to take advantage of it.
__________________
Quote:
Originally Posted by Humus View Post
Releasing a game in 2010 without AA is a completely foreign concept to me. If the technique you're using makes it impossible to use AA then you're using the wrong techniques. As simple as that. Releasing a PC game without AA options is OK only if that means you can only have it enabled[...]
homerdog is offline   Reply With Quote
Old 04-Sep-2012, 17:48   #97
Blazkowicz
Senior Member
 
Join Date: Dec 2004
Location: Toulouse
Posts: 4,144
Default

Quote:
Originally Posted by rpg.314 View Post
I was expecting a much larger amount as well. But SRAM seems too costly for that, even if made on an older process. DRAM or even eDRAM looks more likely than SRAM, IMO.

They seem to have traded off capacity for much larger bandwidth. I hope they are able to hit ~100GB/s if they are going with only 64MB. The comment about on-die framebuffers to feed the display during low-power states is also interesting. They would need ~8MB just for that. I am not sure where they will find room for that, given that 32nm dual-cores have 4MB of L3 in total.
With an interposer, the RAM chip has to be custom-built, I'd think, so the density of what is readily available with DDR3 and DDR4 is less relevant.

I think Intel kept things simple for themselves: they are strong at making SRAM and do it on every process. It also has to be low cost and released quickly while being some of the latest tech. Maybe this is the first mass-market memory-on-interposer product?
Blazkowicz is online now   Reply With Quote
Old 04-Sep-2012, 18:03   #98
Blazkowicz
Senior Member
 
Join Date: Dec 2004
Location: Toulouse
Posts: 4,144
Default

Quote:
Originally Posted by Raqia View Post
Two buses wouldn't make sense for a consumer-level part like Kaveri; it'd be very expensive, and most consumers don't need the flexibility of being able to upgrade.

They should just make Kaveri boards w/ GDDR5 soldered on (much like a graphics board), and they'd get much better memory bandwidth by using a fat bus and a memory controller similar to what they use in GPUs instead of a bottlenecked DIMM interface. I can see lower-end parts having 8 GB and higher-end parts 16 GB w/ 100+ GB/s, compared to ~30 GB/s today.
I could see this on mid-range and high-end laptops, and then on a few micro-ATX and mini-ITX mobos with a generous (for the seller) price tag.

Two buses would just make your chip and socket much more expensive; CPUs that wide are made and sold on sockets G34 and 2011.
A memory controller, or a pair of 64-bit ones, that supports both GDDR5 and DDR3 would be more reasonable. It's like the CPU I'm using, which supports both DDR2 and DDR3.
Blazkowicz is online now   Reply With Quote
Old 05-Sep-2012, 03:45   #99
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Quote:
Originally Posted by Blazkowicz View Post
With an interposer, the RAM chip has to be custom-built, I'd think, so the density of what is readily available with DDR3 and DDR4 is less relevant.

I think Intel kept things simple for themselves: they are strong at making SRAM and do it on every process. It also has to be low cost and released quickly while being some of the latest tech. Maybe this is the first mass-market memory-on-interposer product?
I am pretty sure that wherever this memory ends up, it's going to be some sort of DRAM.
rpg.314 is offline   Reply With Quote
Old 05-Sep-2012, 05:01   #100
DavidC
Member
 
Join Date: Sep 2006
Posts: 273
Default

Quote:
Originally Posted by Alexko View Post
Ideal scaling would take that to 43.52 GFLOPS/W on 20nm, and 87.04 on 14nm.
Ideal scaling died years ago. For the past few process generations, Intel has claimed a ~20% increase in performance OR a >30% reduction in power at the same performance. It's likely the same for other vendors as well.

I don't even know how Haswell will improve performance by even 2x on the 15W Ultrabook parts. If Anandtech's measurements are accurate, it takes ~4W for the CPU cores, 9W for the iGPU and 4W for the rest of the CPU (in typical games), about 17W in total. If you want to bring that down to 15W and double iGPU performance, you need:

18W worth of iGPU performance (2 x 9W) in the 7W left over for the iGPU (15W - 4W - 4W), or ~2.6x the perf/watt. Ivy Bridge is said by Intel to deliver 2x the performance/watt of Sandy Bridge at the same performance level, but it does not reach that 2x perf/watt at higher performance. Obviously that's because they had to use a combination of lowered clocks and process advancement to get to 0.5x the power usage. How will they do that at higher performance?
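
The ~2.6x figure can be checked with a short calculation. The sketch below assumes the CPU cores and the rest of the package keep drawing 4W each, so only the iGPU has to fit into what remains of the 15W budget; the function name is hypothetical.

```python
def required_perf_per_watt_gain(tdp_w: float, cpu_w: float, rest_w: float,
                                igpu_w_now: float, perf_multiplier: float) -> float:
    """Perf/W improvement the iGPU needs to hit a performance target within a TDP.

    Assumes CPU and uncore power stay fixed, so the iGPU gets whatever is left.
    """
    igpu_budget_w = tdp_w - cpu_w - rest_w
    return perf_multiplier * igpu_w_now / igpu_budget_w


gain = required_perf_per_watt_gain(tdp_w=15, cpu_w=4, rest_w=4,
                                   igpu_w_now=9, perf_multiplier=2)
print(round(gain, 2))  # 2.57
```

If the CPU or uncore power also dropped on the new process, the iGPU budget would grow and the required gain would shrink accordingly.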
DavidC is offline   Reply With Quote
