From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from lists.gentoo.org ([140.105.134.102] helo=robin.gentoo.org)
    by nuthatch.gentoo.org with esmtp (Exim 4.62)
    (envelope-from )
    id 1HpADl-00023v-Ui
    for garchives@archives.gentoo.org; Fri, 18 May 2007 21:42:46 +0000
Received: from robin.gentoo.org (localhost [127.0.0.1])
    by robin.gentoo.org (8.14.0/8.14.0) with SMTP id l4ILfuGr001893;
    Fri, 18 May 2007 21:41:56 GMT
Received: from sccrmhc11.comcast.net (sccrmhc11.comcast.net [63.240.77.81])
    by robin.gentoo.org (8.14.0/8.14.0) with ESMTP id l4ILftBJ001852
    for ; Fri, 18 May 2007 21:41:55 GMT
Received: from [71.236.188.93] (c-71-236-188-93.hsd1.or.comcast.net[71.236.188.93])
    by comcast.net (sccrmhc11) with ESMTP
    id <200705182141530110061ph3e>; Fri, 18 May 2007 21:41:53 +0000
Message-ID: <464E1DA0.5030306@cesmail.net>
Date: Fri, 18 May 2007 14:41:52 -0700
From: "M. Edward (Ed) Borasky"
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.2) Gecko/20070221 SeaMonkey/1.1.1
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-science@gentoo.org
Reply-to: gentoo-science@lists.gentoo.org
MIME-Version: 1.0
To: gentoo-science@lists.gentoo.org
Subject: [gentoo-science] [Fwd: [atlas-devel] Athlon64 X2 results]
Content-Type: multipart/mixed; boundary="------------010901060608070204020300"
X-Archives-Salt: 907179ed-d504-44a2-85aa-a7be3346ba3a
X-Archives-Hash: b9d293cc6a2c75b84a95aef388739146

This is a multi-part message in MIME format.
--------------010901060608070204020300
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit


--------------010901060608070204020300
Content-Type: message/rfc822; name="[atlas-devel] Athlon64 X2 results.eml"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="[atlas-devel] Athlon64 X2 results.eml"

X-Account-Key: account2
X-Mozilla-Keys:
Return-Path:
Delivered-To: cesmail-net-znmeb@cesmail.net
Received: (qmail 22798 invoked from network); 14 May 2007 01:35:38 -0000
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on blade2.cesmail.net
X-Spam-Level:
X-Spam-Status: hits=0.0 tests=none version=3.1.8
Received: from unknown (192.168.1.103)
    by blade2.cesmail.net with QMQP; 14 May 2007 01:35:38 -0000
Received: from lists-outbound.sourceforge.net (66.35.250.225)
    by mx53.cesmail.net with SMTP; 14 May 2007 01:35:38 -0000
Received: from sc8-sf-list1-new.sourceforge.net (sc8-sf-list1-new-b.sourceforge.net [10.3.1.93])
    by sc8-sf-spam2.sourceforge.net (Postfix) with ESMTP id 4B55BF8F2;
    Sun, 13 May 2007 18:35:29 -0700 (PDT)
Received: from sc8-sf-mx1-b.sourceforge.net ([10.3.1.91] helo=mail.sourceforge.net)
    by sc8-sf-list1-new.sourceforge.net with esmtp (Exim 4.43)
    id 1HnPRK-0004S5-0Q
    for math-atlas-devel@lists.sourceforge.net; Sun, 13 May 2007 18:33:31 -0700
Received: from rwcrmhc15.comcast.net ([204.127.192.85])
    by mail.sourceforge.net with esmtp (Exim 4.44)
    id 1HnPRI-0006df-UW
    for math-atlas-devel@lists.sourceforge.net; Sun, 13 May 2007 18:33:29 -0700
Received: from [71.237.219.233] (unknown[71.237.219.233])
    by comcast.net (rwcrmhc15) with ESMTP
    id <20070514012552m15007ff8ae>; Mon, 14 May 2007 01:25:52 +0000
Message-ID: <4647BA9F.80400@cesmail.net>
Date: Sun, 13 May 2007 18:25:51 -0700
From: "M.
 Edward (Ed) Borasky"
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.2) Gecko/20070221 SeaMonkey/1.1.1
MIME-Version: 1.0
To: math-atlas-devel@lists.sourceforge.net
Subject: [atlas-devel] Athlon64 X2 results
X-BeenThere: math-atlas-devel@lists.sourceforge.net
X-Mailman-Version: 2.1.8
Precedence: list
Reply-To: "List for developer discussion, NOT SUPPORT."
List-Id: "List for developer discussion, NOT SUPPORT."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: math-atlas-devel-bounces@lists.sourceforge.net
Errors-To: math-atlas-devel-bounces@lists.sourceforge.net
X-SpamCop-Checked: 192.168.1.103 66.35.250.225 10.3.1.93 10.3.1.91 204.127.192.85 71.237.219.233 71.237.219.233
X-SpamCop-Whitelisted: znmeb@cesmail.net

I just got an Athlon64 X2 4200+ (and a motherboard/RAM/hard drive,
etc.). It's taken me a week or so to get it stabilized, but I've got
Gentoo loaded and just built "blas-atlas" and "lapack-atlas" on it.
Here's what I got in the SUMMARY.LOG. Compiler is GCC 4.1.2 and the
kernel is 2.4.21 (Gentoo).

Questions:

1. Do the numbers look right for a dual-core 2210 MHz Athlon64?
2. Does this chip really have SSE3? The /proc/cpuinfo flags that Linux
provides show SSE and SSE2, but not SSE3.

*******************************************************************************
*******************************************************************************
*******************************************************************************
* BEGAN ATLAS3.7.30 INSTALL OF SECTION 0-0-0 ON 05/13/2007 AT 16:54 *
*******************************************************************************
*******************************************************************************
*******************************************************************************

IN STAGE 1 INSTALL: SYSTEM PROBE/AUX COMPILE
   Level 1 cache size calculated as 64KB.
   dFPU: Combined muladd instruction with 5 cycle pipeline.
         Apparent number of registers : 32
         Register-register performance=1691.12MFLOPS
   sFPU: Combined muladd instruction with 5 cycle pipeline.
         Apparent number of registers : 32
         Register-register performance=1576.58MFLOPS

IN STAGE 2 INSTALL: TYPE-DEPENDENT TUNING

STAGE 2-1: TUNING PREC='d' (precision 1 of 4)

   STAGE 2-1-1 : BUILDING BLOCK MATMUL TUNE
      The best matmul kernel was ATL_dmm4x1x90_x87.c, NB=52, written by R. Clint Whaley
      Performance: 4030.58MFLOPS (182.38 percent of of detected clock rate)
        (Gen case got 2222.47MFLOPS)
      mmNN : ma=1, lat=4, nb=28, mu=4, nu=1 ku=28, ff=0, if=5, nf=1
             Performance = 2182.04 (54.14 of copy matmul, 98.73 of clock)
      mmNT : ma=1, lat=4, nb=28, mu=4, nu=1 ku=28, ff=0, if=5, nf=1
             Performance = 1957.55 (48.57 of copy matmul, 88.58 of clock)
      mmTN : ma=1, lat=8, nb=28, mu=4, nu=1 ku=28, ff=0, if=5, nf=1
             Performance = 2063.71 (51.20 of copy matmul, 93.38 of clock)
      mmTT : ma=1, lat=5, nb=28, mu=4, nu=1 ku=28, ff=0, if=5, nf=1
             Performance = 1808.69 (44.87 of copy matmul, 81.84 of clock)
   STAGE 2-1-2: CacheEdge DETECTION
      CacheEdge set to 2097152 bytes
   STAGE 2-1-3: LARGE/SMALL CASE CROSSOVER DETECTION
   STAGE 2-1-3: COPY/NO-COPY CROSSOVER DETECTION
      done.
   STAGE 2-1-4: LEVEL 3 BLAS TUNE
      done.
   STAGE 2-1-5: GEMV TUNE
      gemvN : chose routine 3:ATL_gemvN_1x1_1a.c written by R. Clint Whaley
              Yunroll=32, Xunroll=1, using 90 percent of L1
              Performance = 848.61 (21.05 of copy matmul, 38.40 of clock)
      gemvT : chose routine 105:ATL_gemvT_2x16_1.c written by R. Clint Whaley
              Yunroll=2, Xunroll=16, using 90 percent of L1
              Performance = 830.40 (20.60 of copy matmul, 37.57 of clock)
   STAGE 2-1-6: GER TUNE
      ger : chose routine 1:ATL_ger1_axpy.c written by R. Clint Whaley
            mu=16, nu=1, using 0.84 percent of L1 Cache
            Performance = 618.42 (15.34 of copy matmul, 27.98 of clock)

STAGE 2-2: TUNING PREC='s' (precision 2 of 4)

   STAGE 2-2-1 : BUILDING BLOCK MATMUL TUNE
      The best matmul kernel was ATL_smm14x1x84_sse.c, NB=84, written by R. Clint Whaley
      Performance: 7779.10MFLOPS (352.00 percent of of detected clock rate)
        (Gen case got 1961.99MFLOPS)
      mmNN : ma=1, lat=4, nb=28, mu=4, nu=1 ku=28, ff=0, if=6, nf=1
             Performance = 2020.39 (25.97 of copy matmul, 91.42 of clock)
      mmNT : ma=1, lat=7, nb=28, mu=4, nu=1 ku=28, ff=0, if=6, nf=1
             Performance = 1708.96 (21.97 of copy matmul, 77.33 of clock)
      mmTN : ma=1, lat=4, nb=28, mu=4, nu=1 ku=28, ff=0, if=6, nf=1
             Performance = 1984.23 (25.51 of copy matmul, 89.78 of clock)
      mmTT : ma=1, lat=7, nb=28, mu=4, nu=1 ku=28, ff=0, if=6, nf=1
             Performance = 1683.02 (21.64 of copy matmul, 76.15 of clock)
   STAGE 2-2-2: CacheEdge DETECTION
      CacheEdge set to 2097152 bytes
   STAGE 2-2-3: LARGE/SMALL CASE CROSSOVER DETECTION
   STAGE 2-2-3: COPY/NO-COPY CROSSOVER DETECTION
      done.
   STAGE 2-2-4: LEVEL 3 BLAS TUNE
      done.
   STAGE 2-2-5: GEMV TUNE
      gemvN : chose routine 3:ATL_gemvN_1x1_1a.c written by R. Clint Whaley
              Yunroll=32, Xunroll=1, using 87 percent of L1
              Performance = 1208.92 (15.54 of copy matmul, 54.70 of clock)
      gemvT : chose routine 101:ATL_gemvT_mm.c written by R. Clint Whaley
              Yunroll=0, Xunroll=0, using 87 percent of L1
              Performance = 1258.14 (16.17 of copy matmul, 56.93 of clock)
   STAGE 2-2-6: GER TUNE
      ger : chose routine 1:ATL_ger1_axpy.c written by R. Clint Whaley
            mu=16, nu=1, using 0.97 percent of L1 Cache
            Performance = 1142.53 (14.69 of copy matmul, 51.70 of clock)

STAGE 2-3: TUNING PREC='z' (precision 3 of 4)

   STAGE 2-3-1 : BUILDING BLOCK MATMUL TUNE
      The best matmul kernel was ATL_dmm4x1x90_x87.c, NB=60, written by R. Clint Whaley
      Performance: 3977.18MFLOPS (179.96 percent of of detected clock rate)
        (Gen case got 2140.87MFLOPS)
      mmNN : ma=1, lat=3, nb=24, mu=4, nu=1 ku=24, ff=0, if=5, nf=1
             Performance = 2217.59 (55.76 of copy matmul, 100.34 of clock)
      mmNT : ma=1, lat=4, nb=24, mu=4, nu=1 ku=24, ff=0, if=5, nf=1
             Performance = 2050.31 (51.55 of copy matmul, 92.77 of clock)
      mmTN : ma=1, lat=8, nb=24, mu=4, nu=1 ku=24, ff=0, if=5, nf=1
             Performance = 2086.29 (52.46 of copy matmul, 94.40 of clock)
      mmTT : ma=1, lat=6, nb=24, mu=4, nu=1 ku=24, ff=0, if=5, nf=1
             Performance = 1977.85 (49.73 of copy matmul, 89.50 of clock)
   STAGE 2-3-2: CacheEdge DETECTION
      CacheEdge set to 2097152 bytes
      zdNKB set to 0 bytes
   STAGE 2-3-3: LARGE/SMALL CASE CROSSOVER DETECTION
   STAGE 2-3-3: COPY/NO-COPY CROSSOVER DETECTION
      done.
   STAGE 2-3-4: LEVEL 3 BLAS TUNE
      done.
   STAGE 2-3-5: GEMV TUNE
      gemvN : chose routine 3:ATL_cgemvN_1x1_1a.c written by R. Clint Whaley
              Yunroll=32, Xunroll=1, using 87 percent of L1
              Performance = 1710.40 (43.01 of copy matmul, 77.39 of clock)
      gemvT : chose routine 101:ATL_cgemvT_mm.c written by R. Clint Whaley
              Yunroll=0, Xunroll=0, using 87 percent of L1
              Performance = 1047.50 (26.34 of copy matmul, 47.40 of clock)
   STAGE 2-3-6: GER TUNE
      ger : chose routine 1:ATL_cger1_axpy.c written by R. Clint Whaley
            mu=16, nu=1, using 0.76 percent of L1 Cache
            Performance = 1256.75 (31.60 of copy matmul, 56.87 of clock)

STAGE 2-4: TUNING PREC='c' (precision 4 of 4)

   STAGE 2-4-1 : BUILDING BLOCK MATMUL TUNE
      The best matmul kernel was ATL_smm14x1x84_sse.c, NB=84, written by R. Clint Whaley
      Performance: 7579.67MFLOPS (342.97 percent of of detected clock rate)
        (Gen case got 1886.85MFLOPS)
      mmNN : ma=1, lat=2, nb=24, mu=4, nu=1 ku=24, ff=0, if=5, nf=1
             Performance = 2038.76 (26.90 of copy matmul, 92.25 of clock)
      mmNT : ma=1, lat=5, nb=24, mu=4, nu=1 ku=24, ff=0, if=5, nf=1
             Performance = 1673.32 (22.08 of copy matmul, 75.72 of clock)
      mmTN : ma=1, lat=5, nb=24, mu=4, nu=1 ku=24, ff=0, if=5, nf=1
             Performance = 2006.47 (26.47 of copy matmul, 90.79 of clock)
      mmTT : ma=1, lat=2, nb=24, mu=4, nu=1 ku=24, ff=0, if=5, nf=1
             Performance = 1773.72 (23.40 of copy matmul, 80.26 of clock)
   STAGE 2-4-2: CacheEdge DETECTION
      CacheEdge set to 2097152 bytes
      csNKB set to 0 bytes
   STAGE 2-4-3: LARGE/SMALL CASE CROSSOVER DETECTION
   STAGE 2-4-3: COPY/NO-COPY CROSSOVER DETECTION
      done.
   STAGE 2-4-4: LEVEL 3 BLAS TUNE
      done.
   STAGE 2-4-5: GEMV TUNE
      gemvN : chose routine 3:ATL_cgemvN_1x1_1a.c written by R. Clint Whaley
              Yunroll=32, Xunroll=1, using 87 percent of L1
              Performance = 3235.79 (42.69 of copy matmul, 146.42 of clock)
      gemvT : chose routine 101:ATL_cgemvT_mm.c written by R. Clint Whaley
              Yunroll=0, Xunroll=0, using 87 percent of L1
              Performance = 1215.66 (16.04 of copy matmul, 55.01 of clock)
   STAGE 2-4-6: GER TUNE
      ger : chose routine 1:ATL_cger1_axpy.c written by R. Clint Whaley
            mu=16, nu=1, using 0.75 percent of L1 Cache
            Performance = 2341.82 (30.90 of copy matmul, 105.96 of clock)

STAGE 3: GENERAL LIBRARY BUILD

STAGE 4: POST-BUILD TUNING
   done.
STAGE 4: Threading install

*******************************************************************************
*******************************************************************************
*******************************************************************************
* FINISHED ATLAS3.7.30 INSTALL OF SECTION 0-0-0 ON 05/13/2007 AT 17:27 *
*******************************************************************************
*******************************************************************************
*******************************************************************************

_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel

--------------010901060608070204020300--
--
gentoo-science@gentoo.org mailing list
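
As a rough cross-check on the "percent of detected clock rate" figures in the SUMMARY.LOG above, the sketch below divides each best-kernel MFLOPS number by the clock rate. It assumes the detected clock is the 2210 MHz quoted in the message; the log itself does not print the value ATLAS detected. The names `CLOCK_MHZ`, `percent_of_clock` and `best_kernels` are made up for this illustration and are not part of ATLAS.

```python
# Assumption: ATLAS's "detected clock rate" equals the 2210 MHz from the message.
CLOCK_MHZ = 2210.0


def percent_of_clock(mflops: float, clock_mhz: float = CLOCK_MHZ) -> float:
    """Return a MFLOPS figure as a percentage of the clock rate in MHz."""
    return 100.0 * mflops / clock_mhz


# Best matmul kernel figures copied from the SUMMARY.LOG, keyed by precision.
best_kernels = {
    "d": 4030.58,
    "s": 7779.10,
    "z": 3977.18,
    "c": 7579.67,
}

for prec, mflops in best_kernels.items():
    pct = percent_of_clock(mflops)
    print(f"{prec}: {mflops:8.2f} MFLOPS = {pct:6.2f} percent of clock")
```

Under that assumption the results round to 182.38, 352.00, 179.96 and 342.97, the same percentages the log reports, which suggests the clock ATLAS detected was at or very near 2210 MHz. Put another way, the double-precision kernel sustains about 1.8 floating-point operations per clock cycle and the single-precision SSE kernel about 3.5.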