To: gentoo-user@lists.gentoo.org
From: James
Subject: [gentoo-user] Re: CFLAGs for kernel compilation
Date: Sat, 2 May 2015 18:34:52 +0000 (UTC)
References: <5540C101.70906@ramses-pyramidenbau.de>
 <20150430123819.b72d8b39bd60a912b7c7fde5@gentoo.org>
 <20150501104402.27d943c901f638942262d3d1@gentoo.org>
 <5544B2AB.1010700@googlemail.com>
 <5544B6F2.8040508@googlemail.com>
 <5544BEB9.9050103@googlemail.com>
 <55450389.7070900@googlemail.com>
List-Id: Gentoo Linux mail
Reply-to: gentoo-user@lists.gentoo.org

Volker Armin Hemmann writes:

> >>>>>> http://www.agner.org/optimize/calling_conventions.pdf
> >>>>>
> >>>>> Not sure what you're trying to say.
> >>>>>
> >>>>
> >>>> that simd is not save in kernel if not carefully guarded.
> >>>>
> >>>> Really people, just don't fuck around with the cflags.
> >>>
> >>> I still fail to see the relevance. Unless you mean using a different
> >>> -O level. In that case, yes. You shouldn't. But I was talking about
> >>> -march.
> >>>
> >>
> >> you said this
> >>
> >>>
> >>> (note that SIMD is not FP and is perfectly fine in the kernel.)
> >>
> >> and I have shown you that you are wrong.
> >
> > Not sure why you think that. The kernel crypto routines are full of
> > SIMD code (like SSE and AVX.) Automatic vectorization wouldn't work.
> > But -march is not going to introduce that
>
> and never used in interrupt context and carefully guarded. You act like
> 'oh, you can use simd instructions without any consideration' and that
> is just not true.

Volker,

Historically, you are correct. Looking forward, GCC-5.x will (can?)
change this as SIMD and other hardware, including (DDR_5) memory, all
become available for (compiler) usage. For the longest time, we, the
FOSS communities, have at best been given low-level APIs for access to
some of these hardware resources.

All processor architectures are at war. Intel (the bastards) has had
FPGAs and tools to reconfigure the amount and types of hardware in some
of its processors for quite some time. The Arm64 SoCs have SIMD-centric
(GPU if you like) cores on the same SoC as the licensed arm64 CPU
cores. The new GPU has already been integrated into the processor cores
(same substrate), just as the i387 FPU was some decades ago.
So Arm is providing 'bare metal' access to various customers and
compilers. Since there are thousands of vendors building custom arm64
SoCs, there is no way for Arm to constrict access the way Intel, Nvidia
and AMD have historically done. Game_set_match.

Even though the GPU cores available via arm64 are very weak compared to
those from Nvidia and AMD, bare metal access to those (GPU) resources is
far superior to what Intel (dragging its feet), Nvidia or AMD are
offering. Just look at how AMD's Mantle has stalled for the FOSS
communities. AMD, via competition from a myriad of Arm SoC vendors, is
being forced to roll out arm64 server chips just to stay relevant.

Both of you guys are looking at this issue through historically
color-coded sunglasses. Change is here; get on board with helping the
masses help themselves to the feeding (coding) frenzy.

What a pair of really smart guys like you (2) should be doing is
setting up a Gentoo wiki page that lists and demonstrates for others how
to "profile" low-level code, both kernel and system level, so these
other Gentoo folks *can learn* about what you are saying by example,
running tools such as kernelshark and other performance/profiling types
of analysis. Providing seamless and generic access to the GPU resources
will go a long way towards revitalizing FOSS cryptographic dominance,
and that is a very good thing. ymmv.

For the record, most SIMD hardware really sucks for dense_matrix
requirements. Most SIMD hardware only really works for sparse matrix
apps, like x.264, because the overlying (embedded) algorithms used are
poorly documented by intention from the hardware vendors. I do not have
direct proof, but I strongly suspect this is the case because the SIMD
pipelined memory that these low-level APIs give to the FOSS community
is memory constricted by design.

peace,
James