public inbox for gentoo-science@lists.gentoo.org
* [gentoo-science] [Fwd: [atlas-devel] 3.7.14 out]
@ 2006-08-18  4:13 M. Edward (Ed) Borasky
  2006-08-18 12:33 ` Markus Dittrich
  0 siblings, 1 reply; 2+ messages in thread
From: M. Edward (Ed) Borasky @ 2006-08-18  4:13 UTC
  To: gentoo-science

[-- Attachment #2: [atlas-devel] 3.7.14 out --]
[-- Type: message/rfc822, Size: 8162 bytes --]

From: Clint Whaley <whaley@cs.utsa.edu>
To: math-atlas-devel@lists.sourceforge.net
Subject: [atlas-devel] 3.7.14 out
Date: Thu, 17 Aug 2006 21:16:29 -0500
Message-ID: <200608180216.k7I2GTOE011785@pandora2.cs.utsa.edu>

Guys,

OK, 3.7.14 is out.  Here's the changelog:
   * Fixes/updates to ATLAS config system:
     - Improved cpu throttling probe
     - Added compiler test so only compilers that work are chosen from defaults
     - Added simple C interoperation test
     - Fixed frontend/backend tmpnam collision prob (config[1,0].tmp)
     - Re-enabled parallel make support
     - Fixed buildinfo support
     - Added clock speed probe to config
     - Enabled "make time" to produce performance summary!
     - Added "make check" as alias to "make test" to make more like gnu
       --- OOOps, this isn't there, screwed it up, sorry Kate --- RCW
     - Fixed error in -Si nof77 1, which caused config to die w/o f77 compiler
   * Added new arch defaults for P4E[32,64]SSE3 and HAMMER64SSE3, which get
     better performance for gcc 4.2 (perf should still be OK for gcc 3).

I'm guessing anybody with x86/linux or freebsd/OSX should be able to use
this guy, though I've only got arch defaults for the above 3 systems.  Note
that if you get the newest gcc from SVN, you can get 93% of double precision
theoretical peak using the code generator and the x87 unit on an AMD64!

The big news there is that the final part of the new
install is in place -- "make time" works!  Here's what happens if you do
an install yourself, or if your arch defaults don't have benchmark data:

*******************************************************************************
NAMING ABBREVIATIONS:
   kSelMM : selected matmul kernel (may be hand-tuned)
   kGenMM : generated matmul kernel
   kMM_NT : worst no-copy kernel
   kMM_TN : best no-copy kernel
   BIG_MM : large GEMM timing (usually N=1600); estimate of asymptotic peak
   kMV_N  : NoTranspose matvec kernel
   kMV_T  : Transpose matvec kernel
   kGER   : GER (rank-1 update) kernel
Kernel routines are not called by the user directly, and their
performance is often somewhat different than the total
algorithm (eg, dGER perf may differ from dkGER)


Clock rate=2800Mhz
               single precision        double precision
            *********************    ********************
               real      complex       real      complex
Benchmark   %   Clock   %   Clock   %   Clock   %   Clock
=========   =========   =========   =========   =========
  kSelMM       304.9      291.2      172.7      167.0
  kGenMM        94.0       92.7       89.6       89.0
  kMM_NT        75.2       75.6       67.9       70.5
  kMM_TN        90.9       89.8       79.3       80.5
  BIG_MM       290.1      275.0      164.6      162.2
   kMV_N        33.8      129.3       32.7       43.1
   kMV_T        49.3       46.4       30.8       44.7
    kGER        34.3       66.2       17.0       32.9
*******************************************************************************
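
For anyone who wants absolute numbers, the "% Clock" figures above can be converted to approximate MFLOPS. This is only a sketch: it assumes each entry is MFLOPS expressed as a percentage of the clock rate in MHz, which the column header suggests but this message does not spell out. The helper name and the choice of rows are illustrative, not part of ATLAS.

```python
# Assumption (not stated in the message): "% Clock" = 100 * MFLOPS / clock MHz.
CLOCK_MHZ = 2800  # from "Clock rate=2800Mhz" in the report above

# A few entries from the double precision real column of the table above.
PCT_CLOCK_DREAL = {
    "kSelMM": 172.7,
    "kGenMM": 89.6,
    "BIG_MM": 164.6,
    "kGER": 17.0,
}


def pct_clock_to_mflops(pct, clock_mhz=CLOCK_MHZ):
    """Hypothetical helper: convert a percent-of-clock figure to MFLOPS."""
    return pct / 100.0 * clock_mhz


if __name__ == "__main__":
    for name, pct in PCT_CLOCK_DREAL.items():
        mflops = pct_clock_to_mflops(pct)
        print(f"{name:>7}: {pct:6.1f}% of clock ~ {mflops:7.1f} MFLOPS")
```

Under that assumption, a figure above 100 simply means the kernel retires more than one floating-point operation per clock cycle.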

*******************************************************************************
Here's what happens if you have arch default data to compare against:

                    single precision                  double precision
            ********************************   *******************************
                  real           complex           real           complex
            ---------------  ---------------  ---------------  ---------------
Benchmark   Refrenc Present  Refrenc Present  Refrenc Present  Refrenc Present
=========   ======= =======  ======= =======  ======= =======  ======= =======
  kSelMM      346.2   356.3    344.8   338.7    182.1   182.7    177.2   177.7
  kGenMM      177.7   182.7    144.4   154.6    169.0   154.9    150.8   173.4
  kMM_NT      138.6   134.5    144.4   145.4    118.4   114.9    131.5   134.4
  kMM_TN      141.5   154.9    135.9   142.5    144.9   134.5    142.0   145.4
  BIG_MM      328.9   334.8    320.4   320.7    169.9   171.4    174.3   175.6
   kMV_N       54.2    54.7    144.0   143.7     46.5    47.3     90.1    92.5
   kMV_T       59.2    63.5     74.3    75.4     41.9    42.8     54.1    53.9
    kGER       52.3    52.9    110.3   110.8     26.7    26.7     54.3    53.7

*******************************************************************************
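
One way to read the comparison table above is to compute how far each "Present" figure is from its "Refrenc" figure. The sketch below does that for the double precision real columns; the 5% threshold and the function names are arbitrary choices for illustration and are not something "make time" itself applies.

```python
# Double precision real (Refrenc, Present) pairs copied from the table above.
ROWS = {
    "kSelMM": (182.1, 182.7),
    "kGenMM": (169.0, 154.9),
    "kMM_NT": (118.4, 114.9),
    "kMM_TN": (144.9, 134.5),
    "BIG_MM": (169.9, 171.4),
    "kMV_N": (46.5, 47.3),
    "kMV_T": (41.9, 42.8),
    "kGER": (26.7, 26.7),
}


def relative_change(reference, present):
    """Percent change of the present timing relative to the reference."""
    return (present - reference) / reference * 100.0


def flag_regressions(rows, threshold_pct=5.0):
    """Names of kernels more than threshold_pct below their reference.

    threshold_pct is a hypothetical cutoff chosen for this example.
    """
    return [
        name
        for name, (ref, cur) in rows.items()
        if relative_change(ref, cur) < -threshold_pct
    ]


if __name__ == "__main__":
    for name, (ref, cur) in ROWS.items():
        print(f"{name:>7}: {relative_change(ref, cur):+6.1f}%")
    print("below reference by more than 5%:", flag_regressions(ROWS))
```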

The first set of timings is from a P4E64SSE3, and the second from a HAMMER64SSE3.

BTW, with the help of the gcc guys, I figured out why gcc could never vectorize
any code, and my initial timings show you can get some speedup now.  For
AMD, the single precision speedup is significant.  Intel gets only modest
speedup.

However, right now, ATLAS can't enable gcc's vectorization, because the gcc
guys have made -funsafe-math-optimizations a mandatory flag to enable it, and
this flag ruins IEEE compliance, which we can't do.  I've got a bug report on
this at:
   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28684
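
As general background on why relaxed floating-point flags and vectorization get tied together: a vectorized reduction keeps several partial sums and combines them at the end, which changes the order of the additions, and floating-point addition is not associative. The sketch below simulates that reordering in plain Python. It does not reproduce gcc's code generation or the contents of the bug report, and the function names and the two-lane width are assumptions made for illustration.

```python
def sequential_sum(values):
    """Sum in source order, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def two_lane_sum(values):
    """Sum with two interleaved partial sums, mimicking a 2-wide vector unit."""
    lanes = [0.0, 0.0]
    for i, v in enumerate(values):
        lanes[i % 2] += v
    return lanes[0] + lanes[1]


# Inputs chosen so that rounding makes the order of additions visible.
DATA = [1e16, 1.0, -1e16, 1.0]

if __name__ == "__main__":
    print("sequential:", sequential_sum(DATA))
    print("two-lane  :", two_lane_sum(DATA))
```

The two functions add the same four numbers but return different results, because the 1.0 added directly to 1e16 in the sequential loop is lost to rounding. A compiler that must preserve the exact result of the scalar loop therefore cannot reorder such a sum on its own.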

Cheers,
Clint


* Re: [gentoo-science] [Fwd: [atlas-devel] 3.7.14 out]
  2006-08-18  4:13 [gentoo-science] [Fwd: [atlas-devel] 3.7.14 out] M. Edward (Ed) Borasky
@ 2006-08-18 12:33 ` Markus Dittrich
  0 siblings, 0 replies; 2+ messages in thread
From: Markus Dittrich @ 2006-08-18 12:33 UTC
  To: gentoo-science


Hi,

Thanks for the note! "Unfortunately", starting with 3.7.12
upstream has significantly changed atlas' build system, which
probably means that a major ebuild overhaul is needed. Hence,
it might take a little while until this guy hits portage. Would you
mind filing a bug for it so we can track any progress there?

Thanks,
Markus


-- 
Markus Dittrich (markusle)
Gentoo Linux Developer
Scientific applications


