From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from lists.gentoo.org ([140.105.134.102] helo=robin.gentoo.org)
 by nuthatch.gentoo.org with esmtp (Exim 4.60)
 (envelope-from )
 id 1GDvkq-0007z8-5c
 for garchives@archives.gentoo.org; Fri, 18 Aug 2006 04:14:44 +0000
Received: from robin.gentoo.org (localhost [127.0.0.1])
 by robin.gentoo.org (8.13.7/8.13.6) with SMTP id k7I4E1ar007236;
 Fri, 18 Aug 2006 04:14:01 GMT
Received: from sccrmhc12.comcast.net (sccrmhc12.comcast.net [204.127.200.82])
 by robin.gentoo.org (8.13.7/8.13.6) with ESMTP id k7I4E0Ov003800
 for ; Fri, 18 Aug 2006 04:14:01 GMT
Received: from [67.170.141.18] (c-67-170-141-18.hsd1.or.comcast.net[67.170.141.18])
 by comcast.net (sccrmhc12) with ESMTP
 id <20060818041359012002k977e>; Fri, 18 Aug 2006 04:13:59 +0000
Message-ID: <44E53E79.6040108@cesmail.net>
Date: Thu, 17 Aug 2006 21:13:45 -0700
From: "M. Edward (Ed) Borasky"
User-Agent: Thunderbird 1.5.0.5 (X11/20060813)
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-science@gentoo.org
Reply-to: gentoo-science@lists.gentoo.org
MIME-Version: 1.0
To: gentoo-science@lists.gentoo.org
Subject: [gentoo-science] [Fwd: [atlas-devel] 3.7.14 out]
Content-Type: multipart/mixed;
 boundary="------------060506050901000008090702"
X-Archives-Salt: bc046695-c679-48d4-927f-e25fab32265c
X-Archives-Hash: 59c3ae9d35f4f505f5416d73d1a0e19e

This is a multi-part message in MIME format.
--------------060506050901000008090702
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit


--------------060506050901000008090702
Content-Type: message/rfc822;
 name="[atlas-devel] 3.7.14 out"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="[atlas-devel] 3.7.14 out"

X-Account-Key: account2
Return-Path:
Delivered-To: cesmail-net-znmeb@cesmail.net
Received: (qmail 20818 invoked from network); 18 Aug 2006 02:16:41 -0000
X-Spam-Checker-Version: SpamAssassin 3.1.1 (2006-03-10) on blade5
X-Spam-Level:
X-Spam-Status: hits=0.0 tests=none version=3.1.1
Received: from unknown (192.168.1.103)
 by blade5.cesmail.net with QMQP; 18 Aug 2006 02:16:41 -0000
Received: from lists-outbound.sourceforge.net (66.35.250.225)
 by mx53.cesmail.net with SMTP; 18 Aug 2006 02:16:41 -0000
Received: from sc8-sf-list1-new.sourceforge.net (unknown [10.3.1.93])
 by sc8-sf-spam2.sourceforge.net (Postfix) with ESMTP
 id 5410D12203; Thu, 17 Aug 2006 19:16:40 -0700 (PDT)
Received: from sc8-sf-mx2-b.sourceforge.net ([10.3.1.92] helo=mail.sourceforge.net)
 by sc8-sf-list1-new.sourceforge.net with esmtp (Exim 4.43)
 id 1GDtuY-0004fo-UO
 for math-atlas-devel@lists.sourceforge.net; Thu, 17 Aug 2006 19:16:39 -0700
Received: from mail0.cs.utsa.edu ([129.115.29.4])
 by mail.sourceforge.net with esmtp (Exim 4.44)
 id 1GDtuX-0003Fu-BM
 for math-atlas-devel@lists.sourceforge.net; Thu, 17 Aug 2006 19:16:38 -0700
Received: from pandora2.cs.utsa.edu (pandora2.cs.utsa.edu [129.115.29.24])
 by mail0.cs.utsa.edu (Postfix) with ESMTP
 id 2B0972CA7; Thu, 17 Aug 2006 21:16:28 -0500 (CDT)
Received: (from whaley@localhost)
 by pandora2.cs.utsa.edu (8.12.10+Sun/8.12.10/Submit)
 id k7I2GTOE011785; Thu, 17 Aug 2006 21:16:29 -0500 (CDT)
From: Clint Whaley
Message-Id: <200608180216.k7I2GTOE011785@pandora2.cs.utsa.edu>
Date: Thu, 17 Aug 2006 21:16:29 -0500
To: math-atlas-devel@lists.sourceforge.net
User-Agent: Heirloom mailx 12.0 3/4/06
MIME-Version: 1.0
Subject: [atlas-devel] 3.7.14 out
X-BeenThere: math-atlas-devel@lists.sourceforge.net
X-Mailman-Version: 2.1.8
Precedence: list
Reply-To: "List for developer discussion, NOT SUPPORT."
List-Id: "List for developer discussion, NOT SUPPORT."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: math-atlas-devel-bounces@lists.sourceforge.net
Errors-To: math-atlas-devel-bounces@lists.sourceforge.net
X-SpamCop-Checked: 192.168.1.103 66.35.250.225 10.3.1.93 10.3.1.92 129.115.29.4 129.115.29.24

Guys,

OK, 3.7.14 is out.  Here's the changelog:
   * Fixes/updates to ATLAS config system:
      - Improved cpu throttling probe
      - Added compiler test so only compilers that work are chosen from
        defaults
      - Added simple C interoperation test
      - Fixed frontend/backend tmpnam collision prob (config[1,0].tmp)
      - Re-enabled parallel make support
      - Fixed buildinfo support
      - Added clock speed probe to config
      - Enabled "make time" to produce performance summary!
      - Added "make check" as alias to "make test" to make more like gnu
        --- OOOps, this isn't there, screwed it up, sorry Kate --- RCW
      - Fixed error in -Si nof77 1, which caused config to die w/o f77
        compiler
   * Added new arch defaults for P4E[32,64]SSE3 and HAMMER64SSE3, which get
     better performance for gcc 4.2 (perf should still be OK for gcc 3).

I'm guessing anybody with x86/linux or freebsd/OSX should be able to use this
guy, though I've only got arch defaults for the above 3 systems.  Note that
if you get the newest gcc from SVN, you can get 93% of double precision
theoretical peak using the code generator and the x87 unit on an AMD64!

The big news there is that the final part of the new install is in place --
"make time" works!
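
A note on units for the "make time" summary: the sketch below assumes that a "% Clock" figure means MFLOPS divided by the clock rate in MHz, times 100, so that 100.0 corresponds to one floating-point operation per cycle. That definition is an assumption, and the function names are made up for illustration; none of this is ATLAS code.

```python
# Minimal sketch: convert between "% Clock" readings and MFLOPS.
# ASSUMPTION: % Clock = MFLOPS / clock_MHz * 100 (not confirmed by the post).

def mflops_from_pct_clock(pct_clock: float, clock_mhz: float) -> float:
    """Return the MFLOPS rate implied by a '% Clock' reading."""
    return pct_clock / 100.0 * clock_mhz


def pct_clock_from_mflops(mflops: float, clock_mhz: float) -> float:
    """Return the '% Clock' reading implied by an MFLOPS rate."""
    return mflops / clock_mhz * 100.0


# Example: a reading of 304.9 on a 2800 MHz machine.
rate = mflops_from_pct_clock(304.9, 2800.0)
print(f"{rate:.1f} MFLOPS")  # prints "8537.2 MFLOPS"
```

Under that assumption, readings above 100 simply mean the kernel retires more than one floating-point operation per cycle, which is expected when SSE is in use.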
Here's what happens if you do an install yourself, or if your arch defaults
don't have benchmark data:

*******************************************************************************
NAMING ABBREVIATIONS:
   kSelMM : selected matmul kernel (may be hand-tuned)
   kGenMM : generated matmul kernel
   kMM_NT : worst no-copy kernel
   kMM_TN : best no-copy kernel
   BIG_MM : large GEMM timing (usually N=1600); estimate of asymptotic peak
   kMV_N  : NoTranspose matvec kernel
   kMV_T  : Transpose matvec kernel
   kGER   : GER (rank-1 update) kernel
Kernel routines are not called by the user directly, and their
performance is often somewhat different than the total
algorithm (eg, dGER perf may differ from dkGER)

Clock rate=2800Mhz
               single precision        double precision
            *********************    ********************
              real      complex       real      complex
Benchmark   % Clock     % Clock     % Clock     % Clock
=========   =========   =========   =========   =========
  kSelMM      304.9       291.2       172.7       167.0
  kGenMM       94.0        92.7        89.6        89.0
  kMM_NT       75.2        75.6        67.9        70.5
  kMM_TN       90.9        89.8        79.3        80.5
  BIG_MM      290.1       275.0       164.6       162.2
  kMV_N        33.8       129.3        32.7        43.1
  kMV_T        49.3        46.4        30.8        44.7
  kGER         34.3        66.2        17.0        32.9
*******************************************************************************

Here's what happens if you have arch default data to compare against:

*******************************************************************************
                    single precision                  double precision
            ********************************   *******************************
                  real           complex           real           complex
            ---------------  ---------------  ---------------  ---------------
Benchmark   Refrenc Present  Refrenc Present  Refrenc Present  Refrenc Present
=========   ======= =======  ======= =======  ======= =======  ======= =======
  kSelMM      346.2   356.3    344.8   338.7    182.1   182.7    177.2   177.7
  kGenMM      177.7   182.7    144.4   154.6    169.0   154.9    150.8   173.4
  kMM_NT      138.6   134.5    144.4   145.4    118.4   114.9    131.5   134.4
  kMM_TN      141.5   154.9    135.9   142.5    144.9   134.5    142.0   145.4
  BIG_MM      328.9   334.8    320.4   320.7    169.9   171.4    174.3   175.6
  kMV_N        54.2    54.7    144.0   143.7     46.5    47.3     90.1    92.5
  kMV_T        59.2    63.5     74.3    75.4     41.9    42.8     54.1    53.9
  kGER         52.3    52.9    110.3   110.8     26.7    26.7     54.3    53.7
*******************************************************************************

The first set of timings is from a P4E64SSE3, and the second from a
HAMMER64SSE3.

BTW, with the help of the gcc guys, I figured out why gcc could never
vectorize any code, and my initial timings show you can get some speedup now.
For AMD, the single precision speedup is significant.  Intel gets only modest
speedup.  However, right now, ATLAS can't enable gcc's vectorization, because
the gcc guys have made -funsafe-math-optimizations a mandatory flag to enable
it, and this flag ruins IEEE compliance, which we can't do.  I've got a bug
report on this at:
   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28684

Cheers,
Clint

_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel

--------------060506050901000008090702--
-- 
gentoo-science@gentoo.org mailing list
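
On the point about -funsafe-math-optimizations and IEEE compliance: one likely reason the two conflict is that vectorizing a sum means accumulating partial sums in separate lanes, which reorders the additions, and floating-point addition is not associative. The Python sketch below is only an illustration of that effect; it is not gcc or ATLAS code, and the two-lane split is a made-up stand-in for a 2-wide vector reduction.

```python
# Minimal sketch: reordering a floating-point sum can change the result.

def sum_in_order(xs):
    """Add the values strictly left to right, as scalar code would."""
    total = 0.0
    for x in xs:
        total += x
    return total


def sum_two_lanes(xs):
    """Accumulate even- and odd-indexed values separately, then combine.

    This mimics (hypothetically) how a 2-wide vectorized reduction
    would group the additions.
    """
    lane0 = 0.0
    lane1 = 0.0
    for i, x in enumerate(xs):
        if i % 2 == 0:
            lane0 += x
        else:
            lane1 += x
    return lane0 + lane1


values = [1e16, 1.0, -1e16, 1.0]
print(sum_in_order(values))   # prints 1.0 (the first 1.0 is absorbed by 1e16)
print(sum_two_lanes(values))  # prints 2.0
```

Both orderings are legitimate ways to add the same four numbers, yet they disagree, which is why a compiler that must honor IEEE semantics cannot regroup the additions on its own.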