Old 06-Jun-2012, 07:44   #1
Ethatron
Member
 
Join Date: Jan 2010
Posts: 375
ATI Released: squish-ccr (alpha version)

I'm proud to present squish as a compute-shader:


I implemented the concurrent "version" of the library in such a way that you can compile an AMP, a C++ or a DirectCompute version with the same code. This comes with a bit of overhead and it's not yet totally elegant.
I tried to maintain readability, though parallelization certainly muddies the clarity of an otherwise simple algorithm. Since AMP has to see all functions (no "extern"), it's now possible to wrap squish into namespaces. I'll post any code on request.
You can compile the AMP-implementation for CPU as well (USE_AMP_DEBUG).

The repository is here. The license remains MIT for now.

Compiling for AMP takes about 15 minutes (and crashes frequently), for Direct Compute 2 minutes I think. It's alpha, written down "theoretically optimal" and made to compile. I have just started verifying its functionality; I'm sure a lot is broken.
I'm also sitting over the assembly, checking whether I can do better on optimizations. A few reductions are corner cases ("1x condition + 1x4 thread-group accessing parallel insns" versus "4x1 register serial insns") and one has to count the beans ^H^H clocks to know which is better. Well, with time it's going to get better, I'm sure.

You're invited to hack it up, help fix it etc. Or just wait until I unleash my BC7 compressor based on this.

Thanks for being such a great forum - I'd never have got here, achieving something so complex, without all of you guys. Thanks a lot!
Old 06-Jun-2012, 10:11   #2
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,863

I submitted a tiny change. EDIT: nope, lost in GIT noob land

But overall, here on my Win8+VS2011-beta+GSA1.59 system, it's badly broken.

Simple things first, here is what GSA says:

Code:
Input-array dimensionality has been set to 4
clusterfit.cpp(455,12): warning X3078: 'i': loop control variable conflicts with a previous declaration in the outer scope; most recent declaration will be used
clusterfit.cpp(483,12): warning X3078: 'i': loop control variable conflicts with a previous declaration in the outer scope; most recent declaration will be used
clusterfit.cpp(630,14): warning X3078: 'm': loop control variable conflicts with a previous declaration in the outer scope; most recent declaration will be used
clusterfit.cpp(632,14): warning X3078: 'm': loop control variable conflicts with a previous declaration in the outer scope; most recent declaration will be used
clusterfit.cpp(776,14): warning X3078: 'm': loop control variable conflicts with a previous declaration in the outer scope; most recent declaration will be used
clusterfit.cpp(778,14): warning X3078: 'm': loop control variable conflicts with a previous declaration in the outer scope; most recent declaration will be used
clusterfit.cpp(780,14): warning X3078: 'm': loop control variable conflicts with a previous declaration in the outer scope; most recent declaration will be used
colourset.cpp(167,8): warning X3575: loop termination conditions in varying flow control cannot depend on data read from a UAV, forcing loop to unroll
maths.cpp(404,8): warning X3575: loop termination conditions in varying flow control cannot depend on data read from a UAV, forcing loop to unroll
maths.cpp(525,13): warning X3571: pow(f, e) will not work for negative f, use abs(f) or conditionally handle negative values if you expect them
maths.cpp(527,13): warning X3571: pow(f, e) will not work for negative f, use abs(f) or conditionally handle negative values if you expect them
maths.cpp(442,8): warning X3575: loop termination conditions in varying flow control cannot depend on data read from a UAV, forcing loop to unroll
memory detected, consider making this write conditional.
VS says this on the squishtest project:

Code:
Error 1 error C3646: 'ccr_restricted' : unknown override specifier e:\my documents\github\squish-ccr\maths.h 478 1 squish
over and over again. It seems unable to resolve the symbol and for some reason reads it as the literal override specifier "ccr_restricted". I tried #defining USE_AMP;USE_AMP_DEBUG in the project's properties but that made no difference.

Not sure what to do next.
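
A note on the X3571 warnings in the GSA output: HLSL's pow(f, e) is typically lowered to exp2(e * log2(f)), so a negative f gives NaN even where the mathematical result is defined. The sketch below emulates that lowering in plain C++ and shows the two workarounds the warning text hints at. The helper names (pow_via_log2, pow_abs, pow_signed) are made up for illustration and are not taken from maths.cpp; which variant is appropriate depends on what the calling code expects.

```cpp
#include <cassert>
#include <cmath>

// Emulates the assumed shader lowering: pow(f, e) = exp2(e * log2(f)).
// log2 of a negative number is NaN, so the whole expression is NaN.
inline float pow_via_log2(float f, float e) {
  return std::exp2(e * std::log2(f));
}

// Workaround 1, as the warning suggests: discard the sign of the base.
inline float pow_abs(float f, float e) {
  return std::pow(std::fabs(f), e);
}

// Workaround 2: handle negative values explicitly by restoring the sign
// of the base on the result (only sensible for odd-symmetric curves).
inline float pow_signed(float f, float e) {
  return std::copysign(std::pow(std::fabs(f), e), f);
}

// Small self-check showing the behaviour for a negative base.
inline void check_pow_helpers() {
  assert(std::isnan(pow_via_log2(-2.0f, 2.0f)));
  assert(std::fabs(pow_abs(-2.0f, 2.0f) - 4.0f) < 1e-5f);
  assert(std::fabs(pow_signed(-2.0f, 3.0f) + 8.0f) < 1e-5f);
}
```

Calling check_pow_helpers() from any test harness exercises all three helpers.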
__________________
Can it play WoW?
Old 06-Jun-2012, 22:43   #3
Ethatron
Member
 
Join Date: Jan 2010
Posts: 375

> "memory detected, consider making this write conditional."

That is some race-condition error. It can be evaded (as I did in the latest push), but I have to find out what the compiler is thinking there. The compiler isn't very detailed about what it thinks is going on (and the message in GPU Shader Analyzer is cut off, BTW). Sometimes fxc will compile but GPUSA won't; that program is a bit shaky.

So, just try the HLSL now, it should compile.

> "Error 1 error C3646: 'ccr_restricted' "

Yes, I still have to bring all my bridging preprocessor stuff for the CPU-plain version over from DDSopt. I use the CPU-plain version in VS2010; it'll just be a few hours to bring that up to date.
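
For readers hitting the same C3646: MSVC typically reports an "unknown override specifier" when it finds an identifier it doesn't know right after a declarator, which is what happens when ccr_restricted is never defined as a macro. A bridging header along these lines would let one source compile both ways. This is only a sketch: the macro name comes from the error message, but the definitions, the exact effect of USE_AMP and the saturate_unit helper are assumptions, not the squish-ccr source.

```cpp
#include <cassert>

// Hypothetical bridge: expand ccr_restricted to the C++ AMP restriction
// specifier when building for AMP, and to nothing for plain CPU builds.
#if defined(USE_AMP)
  #include <amp.h>
  #define ccr_restricted restrict(amp, cpu)
#else
  #define ccr_restricted
#endif

// Illustrative helper (not from squish): clamps x to [0, 1].
// With the macro defined, the trailing token is either a valid
// restriction specifier or disappears entirely.
inline float saturate_unit(float x) ccr_restricted {
  return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

// Plain-CPU usage check.
inline void check_bridge() {
  assert(saturate_unit(-1.0f) == 0.0f);
  assert(saturate_unit(0.25f) == 0.25f);
  assert(saturate_unit(3.0f) == 1.0f);
}
```

Whatever the real definitions look like, the define that selects them has to be visible to every project that compiles the headers.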

I added AMP instructions to the wiki; it's essentially the same way to include the files as in DirectCompute. Most of the mess stems from the fact that you have to pull all of the source in to compile for AMP, which is diametrically opposed to normal C/C++ practice, and your linker will go crazy. I hope we may get something more elegant (than what's in the wiki) with time.

Edit: updated, and added solutions for VS2005 to VS2012 (2010 & 2012 with 64-bit platform target). I'm going to go ahead and make an example AMP project for VS2012.

Last edited by Ethatron; 06-Jun-2012 at 23:29.
Old 07-Jun-2012, 11:07   #4
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,863

I had worked out a solution to the ccr_restricted problem. I needed to put a #define in the squish project's properties, instead of squishtest's. Irrelevant now, though.

I am having terrible trouble with GIT, especially cos I'm a noob with it. I am using the brand-spanking new GIT for Windows client and I can't sync commits because it tells me they aren't staged (which is how I ended up with the mess of branches, before). I suspect this is a bug but I don't want to turn this into a "help me, I'm a GIT noob" thread.

I can't even make the GIT web pages do what I want. I wanted to make my fork match your most recent master, but couldn't see any way to do that. So I destroyed my fork, re-forked from your master and I can now open the HLSL file in GSA, and it compiles.

The VS2011 solution loads cleanly here, but it won't build. The solution/project doesn't have an explicit include path set for libpng. I suppose you have an environment variable configured...
__________________
Can it play WoW?
Old 07-Jun-2012, 23:48   #5
Ethatron
Member
 
Join Date: Jan 2010
Posts: 375

Oh, no, I never use(d) the squish examples. But I have libpng in my VS-installation directory, so I probably never would have a problem with including and linking. If you want I can give you compiled png/zlib libs and includes for the VS of your choice, I'd put that under "Downloads" there on Github.

I use TortoiseGit, all the stuff you can do is available in the context menu. I guess you have SSH-key and all set up ...? Actually it takes a bit of time to understand the meaning of those available actions, it's not always context-free, I'm not using all of it.

I'm going to make a "squishamp" project soon, which hopefully can be used to debug/verify it's correct code (USE_DEBUG_AMP). Otherwise, I'm open to any suggestion on how to step through AMP/DirectCompute code that doesn't involve W8. In the end DEBUG isn't able to "produce" e.g. race-condition problems as it's all serialized, so one day or the next we have to get to the GPU code anyway.

I've also started on the groundwork for BC7/BC6H. Half of the modes should be trivial; the other half (with anchor-index) should be difficult. I wonder which tool to use to verify the correctness of the created files; Compressonator can't load them, I think ...

Lots of edges at the bleeding technology ...
Old 08-Jun-2012, 13:58   #6
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,863

Quote:
Originally Posted by Ethatron
Oh, no, I never use(d) the squish examples. But I have libpng in my VS-installation directory, so I probably never would have a problem with including and linking. If you want I can give you compiled png/zlib libs and includes for the VS of your choice, I'd put that under "Downloads" there on Github.

I don't think it's necessary to put stuff into the git, and overall I'm trying to stay out of your way.

I got hold of libpng and zlib and made them sister directories of squish.

I think, with the magic of GIT, I can make a local change to solve this problem (#include "..\..\libpng\png.h") and just never include it in a pull I send your way.

libpng solves this kind of problem (i.e. in referencing zlib) by making zlib part of the solution and making the path to the zlib source a property of the zlib project ("zlib.props" file). But I suspect you're not bothered and squish has been around for long enough without taking care of this.

Quote:
I use TortoiseGit, all the stuff you can do is available in the context menu. I guess you have SSH-key and all set up ...? Actually it takes a bit of time to understand the meaning of those available actions, it's not always context-free, I'm not using all of it.

Funnily enough a patch to the client today has fixed the problem I had yesterday, i.e. I can now sync a local commit up to the hosted repo, without first making a branch for it. Glad it wasn't me.

This client does a load of things automatically, including the SSH thing, all because I started on the git website and it offered the "get going in Windows" option. I'll persevere with it...

Quote:
I'm going to make a "squishamp" project soon, which hopefully can be used to debug/verify it's correct code (USE_DEBUG_AMP). Otherwise, I'm open to any suggestion on how to step through AMP/DirectCompute code that doesn't involve W8. In the end DEBUG isn't able to "produce" e.g. race-condition problems as it's all serialized, so one day or the next we have to get to the GPU code anyway.

Total noob as far as squish is concerned...
__________________
Can it play WoW?

Last edited by Jawed; 08-Jun-2012 at 14:19. Reason: simpler include
Old 17-Sep-2012, 21:55   #7
Ethatron
Member
 
Join Date: Jan 2010
Posts: 375

While adding BC6/7 support to squish I detected a problem with the available standard BC7 coders and worked around the issue for the upcoming squish version. I think everyone should be aware of it.
Old 10-Oct-2012, 08:26   #8
Ethatron
Member
 
Join Date: Jan 2010
Posts: 375

I added an alpha-fit method which reduces the squared error of alpha-channel compression by 50% on average. The performance impact is not very large, certainly not as large as that of cluster-fit (60% vs. 4.5% of program execution for cluster-fit and alpha-fit respectively).

It's binary-search based, and takes at most 2*log2(max-min) iterations to find an almost optimal line through a 1D point cloud. I'm going to develop some graphs to show how close to optimal it is. An exhaustive search for the optimal coding would take ((max-min) * (max-min-1))/2 iterations, converging to ((max-min)^2)/2.
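
To make the iteration counts concrete, here is a rough sketch of a search with that kind of budget. It is not the squish-ccr implementation, which isn't shown in this thread. It assumes a 16-value block, an 8-entry palette interpolated in sevenths between two endpoints (as in DXT5 alpha), and a greedy step-halving refinement that tries moving each endpoint inward once per step size. All names (Block, block_error, step_halving_fit) are made up for the example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Block = std::array<std::uint8_t, 16>;

// Sum of squared distances from each value to the nearest of 8 levels
// interpolated in sevenths between lo and hi.
inline long block_error(const Block& px, int lo, int hi) {
  assert(lo <= hi);
  long err = 0;
  for (std::uint8_t p : px) {
    long best = -1;
    for (int i = 0; i <= 7; ++i) {
      const long level = ((7 - i) * lo + i * hi) / 7;
      const long d = static_cast<long>(p) - level;
      if (best < 0 || d * d < best) best = d * d;
    }
    err += best;
  }
  return err;
}

struct Fit {
  int lo, hi;  // chosen endpoints
  long err;    // squared error at those endpoints
  int evals;   // candidate endpoint pairs evaluated during refinement
};

// Start at (min, max), then for step = range/2, range/4, ..., 1 try
// moving each endpoint inward by step and keep the move if it helps.
inline Fit step_halving_fit(const Block& px) {
  int lo = 255, hi = 0;
  for (std::uint8_t p : px) {
    if (p < lo) lo = p;
    if (p > hi) hi = p;
  }
  Fit f{lo, hi, block_error(px, lo, hi), 0};
  for (int step = (hi - lo) / 2; step >= 1; step /= 2) {
    if (f.lo + step < f.hi) {
      const long e = block_error(px, f.lo + step, f.hi);
      ++f.evals;
      if (e < f.err) { f.lo += step; f.err = e; }
    }
    if (f.hi - step > f.lo) {
      const long e = block_error(px, f.lo, f.hi - step);
      ++f.evals;
      if (e < f.err) { f.hi -= step; f.err = e; }
    }
  }
  return f;
}
```

For a block spanning the full 0..255 range this sketch evaluates at most 14 endpoint pairs (seven step sizes, two tries each), in line with the 2*log2(max-min) bound of roughly 16, whereas the exhaustive count quoted above comes to 255*254/2 = 32385. Being greedy, it carries no optimality guarantee.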


Tags
amp, directcompute, dxt
