Old 25-Jul-2013, 10:47   #1
Plamensito
Junior Member
 
Join Date: Jul 2013
Location: EU, Bulgaria
Posts: 24
Default [Performance] : ARM Mali T628 MP6




As already mentioned, Samsung has announced the Exynos 5420 Octa. Its graphics will be handled by the Mali T628 with 6 clusters.

The Mali T604, T624 and T628 are all based on 16 FP and 2 Vec4 per cluster.


The Mali T628 in the Exynos 5420 contains 6 clusters. Peak throughput is calculated as follows:

16 FP x 2 Vec4 x 6 clusters x 0.533 GHz = 102.336 GFLOPS

As announced by Samsung, the Mali T628 MP6 is twice as fast as the PowerVR SGX544MP3 (Exynos 5410 Octa):

4 USSE2 x 4 MADs x 2 ALU x 3 MP x 0.533 GHz = 51.168 GFLOPS
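For anyone who wants to check the two products above, here is a minimal Python sketch. The function name `peak_gflops` is made up for illustration, and the factors are copied straight from the formulas in this post; it only multiplies them out and says nothing about real-world performance.

```python
from math import prod


def peak_gflops(per_clock_factors, clock_ghz):
    """Multiply the per-clock factors together, then scale by the clock in GHz."""
    return prod(per_clock_factors) * clock_ghz


CLOCK_GHZ = 0.533  # 533 MHz, the clock used in both formulas

# Mali T628 MP6: 16 FP x 2 Vec4 x 6 clusters
mali_t628_mp6 = peak_gflops([16, 2, 6], CLOCK_GHZ)

# SGX544 MP3: 4 USSE2 x 4 MADs x 2 ALU x 3 cores
sgx544_mp3 = peak_gflops([4, 4, 2, 3], CLOCK_GHZ)

print(f"Mali T628 MP6: {mali_t628_mp6:.3f} GFLOPS")
print(f"SGX544 MP3:    {sgx544_mp3:.3f} GFLOPS")
```

Because the clock enters as GHz, the result comes out directly in GFLOPS.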

Now look at the table for offscreen performance:

Note: PowerVR SGX554MP4 has an additional scalar (x 1.125)

Note 2: More on Adreno 330:
http://www.359gsm.com/forum/viewtopic.php?f=127&t=13152

Old 25-Jul-2013, 11:27   #2
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 8,494
Default

Hmm interesting twist if true on the ALU lane count for the S600 Adreno 320 and Adreno330.

As for the rest the math results are correct you just have quite a complicated way of calculating things.

I'm not sure if Mali T6xx has Vector ALUs and not SIMDs; I'd like to think it's the latter for all newer generation GPUs. In that case it makes my life easier to think for a T628MP6@533MHz:

6 * SIMD16 * 2 FLOPs * 0.533GHz = 102.34 GFLOPs

Oh and by the way for accuracy's sake if you're going to count probably SFU FLOPs for SGX554 you should also count them for something like the T604 (1 SFU/SIMD16 afaik). In that case the T604 is actually at a theoretical peak of ~72GFLOPs.
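A rough Python sketch of that lane-count shortcut, in case it helps. It only counts lanes and makes no claim about how the hardware is actually organised. The function name and parameters are hypothetical, and the T604 configuration used here (4 cores at the same 533 MHz) is an assumption, not something stated in this post.

```python
def lane_peak_gflops(cores, lanes, clock_ghz, sfu_lanes=0, flops_per_lane=2):
    """Peak GFLOPS from a plain lane count.

    flops_per_lane=2 counts one multiply-add per lane per clock.
    sfu_lanes adds the optional special-function lanes per core.
    """
    return cores * (lanes + sfu_lanes) * flops_per_lane * clock_ghz


# T628 MP6 at 533 MHz, no SFU counted: 6 * 16 * 2 * 0.533
t628_mp6 = lane_peak_gflops(cores=6, lanes=16, clock_ghz=0.533)

# Assumption: T604 MP4 at 533 MHz, counting 1 SFU lane per 16 lanes
t604_mp4_sfu = lane_peak_gflops(cores=4, lanes=16, clock_ghz=0.533, sfu_lanes=1)

print(f"T628 MP6:           {t628_mp6:.2f} GFLOPS")
print(f"T604 MP4 (with SFU): {t604_mp4_sfu:.2f} GFLOPS")
```

Under those assumptions the T604 figure lands in the ~72 GFLOPS range mentioned above.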
__________________
People are more violently opposed to fur than leather; because it's easier to harass rich ladies than motorcycle gangs.
Old 25-Jul-2013, 11:34   #3
Rys
Tiled
 
Join Date: Oct 2003
Location: Abbots Langley, UK
Posts: 2,745
Default

T6xx isn't SIMD.
__________________
Mr. Popples!
Old 25-Jul-2013, 11:38   #4
Plamensito
Junior Member
 
Join Date: Jul 2013
Location: EU, Bulgaria
Posts: 24
Default

Quote:
Originally Posted by Ailuros View Post
Hmm interesting twist if true on the ALU lane count for the S600 Adreno 320 and Adreno330.

As for the rest the math results are correct you just have quite a complicated way of calculating things.

I'm not sure if Mali T6xx has Vector ALUs and not SIMDs; I'd like to think it's the latter for all newer generation GPUs. In that case it makes my life easier to think for a T628MP6@533MHz:

6 * SIMD16 * 2 FLOPs * 0.533GHz = 102.34 GFLOPs

Oh and by the way for accuracy's sake if you're going to count probably SFU FLOPs for SGX554 you should also count them for something like the T604 (1 SFU/SIMD16 afaik). In that case the T604 is actually at a theoretical peak of ~72GFLOPs.
Yes, for the Mali T604, adding +1 TMU gives 72.4 GFLOPS.

The table does not include the additional 1 TMU (8x2+1).
Old 25-Jul-2013, 11:41   #5
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 8,494
Default

Quote:
Originally Posted by Rys View Post
T6xx isn't SIMD.
I stand corrected. What a shame.
Old 25-Jul-2013, 11:45   #6
tangey
Senior Member
 
Join Date: Jul 2006
Location: 0x5FF6BC
Posts: 1,032
Default

T628MP6=150% of T604MP4 at same frequency. So T628 has no extra compute power core for core over T604 ?

The graph in the link has a somewhat higher figure for T604 than the graph on this thread ?
__________________
Check out my blog charting my challenge to have a maximum value rewards holiday:-
http://www.goingonrewards.com

Last edited by tangey; 25-Jul-2013 at 12:01.
Old 25-Jul-2013, 12:12   #7
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 8,494
Default

Quote:
Originally Posted by tangey View Post
T628MP6=150% of T604MP4 at same frequency. So T628 has no extra compute power core for core over T604 ?

The graph in the link has a somewhat higher figure for T604 than the graph on this thread ?
Probably not; but I'd expect the 628 (see the diagram on ARM's site compared to a 624 f.e.) to have quite a few aspects doubled compared to 604.
Old 25-Jul-2013, 12:19   #8
Plamensito
Junior Member
 
Join Date: Jul 2013
Location: EU, Bulgaria
Posts: 24
Default

Mali T604 and T624 are the same.

http://arm.com/products/multimedia/m...pute/index.php

The same goes for PowerVR SGX543 and SGX544: they are the same.

They just fall under different licenses.

I could be wrong, but the bills are
Old 25-Jul-2013, 12:34   #9
Plamensito
Junior Member
 
Join Date: Jul 2013
Location: EU, Bulgaria
Posts: 24
Default

The Mali T628 MP6 cannot support:

2x MMU, 2x Level 2 Cache, 2x AMBA 4 ACE-Lite

because those belong to the 8-core configuration. The above is supported only with 8 cores.

See the diagram:

http://arm.com/products/multimedia/m...pute/index.php
Old 25-Jul-2013, 13:08   #10
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 8,494
Default

You don't happen to have any details on IMG Rogue variants do you?
Old 25-Jul-2013, 13:20   #11
Plamensito
Junior Member
 
Join Date: Jul 2013
Location: EU, Bulgaria
Posts: 24
Default

Quote:
Originally Posted by Plamensito View Post
The Mali T628 MP6 cannot support:

2x MMU, 2x Level 2 Cache, 2x AMBA 4 ACE-Lite

because those belong to the 8-core configuration. The above is supported only with 8 cores.

See the diagram:

http://arm.com/products/multimedia/m...pute/index.php

Assumption!!!

12USSE2

Old 25-Jul-2013, 13:25   #12
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 8,494
Default

Well we have time for that; let's stick to Mali T628 for the time being.
Old 25-Jul-2013, 14:23   #13
Plamensito
Junior Member
 
Join Date: Jul 2013
Location: EU, Bulgaria
Posts: 24
Default

Quote:
Originally Posted by Ailuros View Post
Hmm interesting twist if true on the ALU lane count for the S600 Adreno 320 and Adreno330.

As for the rest the math results are correct you just have quite a complicated way of calculating things.

I'm not sure if Mali T6xx has Vector ALUs and not SIMDs; I'd like to think it's the latter for all newer generation GPUs. In that case it makes my life easier to think for a T628MP6@533MHz:

6 * SIMD16 * 2 FLOPs * 0.533GHz = 102.34 GFLOPs

Oh and by the way for accuracy's sake if you're going to count probably SFU FLOPs for SGX554 you should also count them for something like the T604 (1 SFU/SIMD16 afaik). In that case the T604 is actually at a theoretical peak of ~72GFLOPs.
Yes, for the Mali T604, adding +1 TMU (8x2+1) gives 72.4 GFLOPS.
The table does not include the additional 1 TMU for the Mali GPUs, so that it shows the exact 2x difference between the T628 MP6 (102.336) and the SGX544MP3 (51.168).

If you add the 1 TMU, the T628 MP6 looks like this:
17 x 2 x 6 x 0.533 = 108.732 GFLOPS
Old 25-Jul-2013, 18:05   #14
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 8,494
Default

You mean SFU instead of TMU I guess.

Anyway if you want to count the SFU FLOPs for the SGX544MP3 in the Exynos5410 it should be:

Each ALU = Vec4 + 1 or else 9 FLOPs/ALU

3 cores * 4 ALUs * 9 FLOPs * 0.533GHz = 57.56 GFLOPs
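A small Python sketch of that per-ALU count, for anyone following along. The split of the Vec4 into 4 lanes x (multiply + add) = 8 FLOPs is my reading of the 9 FLOPs/ALU figure, and all the names are hypothetical.

```python
CLOCK_GHZ = 0.533   # 533 MHz
CORES = 3           # SGX544 MP3
ALUS_PER_CORE = 4

VEC4_MAD_FLOPS = 4 * 2  # 4 lanes x (multiply + add) = 8 FLOPs per ALU per clock
EXTRA_FLOPS = 1         # the "+1" counted on top of the Vec4


def sgx_gflops(flops_per_alu):
    """Peak GFLOPS for the whole GPU given the FLOPs one ALU does per clock."""
    return CORES * ALUS_PER_CORE * flops_per_alu * CLOCK_GHZ


without_extra = sgx_gflops(VEC4_MAD_FLOPS)
with_extra = sgx_gflops(VEC4_MAD_FLOPS + EXTRA_FLOPS)
scale = with_extra / without_extra

print(f"Vec4 only: {without_extra:.3f} GFLOPS")
print(f"Vec4 + 1:  {with_extra:.3f} GFLOPS")
print(f"Scale:     x{scale:.3f}")
```

Going from 8 to 9 FLOPs per ALU is a factor of 9/8 = 1.125, which matches the "x 1.125" note in the first post.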

Albeit I'd personally prefer IHVs or their respective marketing departments to not count things like SFU FLOPs into the arithmetic throughput.

As for Rogue I'd say it doesn't hurt to say that each cluster = SIMD16 + 2 TMUs

G6130 = 1*SIMD16
G62x0 = 2*SIMD16
G64x0 = 4*SIMD16
G6630 = 6*SIMD16
Old 25-Jul-2013, 19:34   #15
Plamensito
Junior Member
 
Join Date: Jul 2013
Location: EU, Bulgaria
Posts: 24
Default

Quote:
Originally Posted by Ailuros View Post
You mean SFU instead of TMU I guess.

Anyway if you want to count the SFU FLOPs for the SGX544MP3 in the Exynos5410 it should be:

Each ALU = Vec4 + 1 or else 9 FLOPs/ALU

3 cores * 4 ALUs * 9 FLOPs * 0.533GHz = 57.56 GFLOPs

Albeit I'd personally prefer IHVs or their respective marketing departments to not count things like SFU FLOPs into the arithmetic throughput.

As for Rogue I'd say it doesn't hurt to say that each cluster = SIMD16 + 2 TMUs

G6130 = 1*SIMD16
G62x0 = 2*SIMD16
G64x0 = 4*SIMD16
G6630 = 6*SIMD16
Yes, as I said in the opening post, the additional scalar for the SGX544 MP3 and the additional SFU for the Mali GPU are not taken into account.

With or without them, the gap remains about the same: roughly x2 in favor of the Mali T628 MP6 over the SGX544 MP3.

Without additional scalar/SFU :

Mali T628 MP6 = 102.336
SGX544 MP3 = 51.168

With additional scalar/SFU :

Mali T628 MP6 = 108.732
SGX544 MP3 = 57.564
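To put numbers on the gap, here is a short Python sketch that compares the two pairs of figures above both as a ratio and as an absolute difference. The dictionary layout and the `compare` helper are just illustrative choices.

```python
figures = {
    "without": {"mali_t628_mp6": 102.336, "sgx544_mp3": 51.168},
    "with": {"mali_t628_mp6": 108.732, "sgx544_mp3": 57.564},
}


def compare(mali, sgx):
    """Return (ratio, absolute gap in GFLOPS) of Mali over SGX."""
    return mali / sgx, mali - sgx


for label, f in figures.items():
    ratio, gap = compare(f["mali_t628_mp6"], f["sgx544_mp3"])
    print(f"{label} scalar/SFU: ratio {ratio:.2f}x, gap {gap:.3f} GFLOPS")
```

The absolute gap is 51.168 GFLOPS in both cases, while the ratio is 2.00x without the extra units and about 1.89x with them.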

Regards
