Subject: Re: [gentoo-dev] New project: LLVM
To: gentoo-dev@lists.gentoo.org
From: james
Date: Fri, 19 Aug 2016 16:52:53 -0400
Message-ID: <905c6752-af7d-f19e-2698-b31f2ee5f89d@verizon.net>
References: <20160816182204.61c27681.mgorny@gentoo.org> <20160819020737.5419083.98443.119986@pathscale.com> <36efd7a3-ce51-43ed-8aef-e1d6d79a4e5d@gentoo.org> <1b9f33d3-3807-bf55-f021-3ecfb0654c7d@verizon.net>

On 08/19/2016 02:20 PM, C Bergström wrote:
> Sorry to be the party crasher, but...
>
> I'd love to have optimizations for everything out there, but it takes
> a lot of work to fine tune for something specific.

Agreed. Right now on ARMv8 alone, there are dozens of teams working on
the identical concepts presented in this thread. Most are also targeting
specific domains. At some point there will be pathways, just as in
computational chemistry, where optimizing for new silicon is fast
because previous work helps tremendously. That is, you are not alone in
your quest; far, far from it.

> Right now I see a few variants of ARMv8
> ------------
> ARM reference stuff - A57 cores and the newer bits.. The scheduling
> and stuff seems more-or-less similar enough that one tuning could
> probably work for the vast majority of these parts.
>
> Cavium ThunderX - It's ground up and quite different from the ARM
> reference stuff under the hood
>
> APM - Mustang, again ground up and different. I don't have enough
> hands on to know how different from reference.
>
> Broadcom - Coming Soon(tm) - Again no hands on or any data, but
> certainly very interesting..
>
> ... now add in every variant of ground up implementation and you have
> 50 shades of gray..

And billions of dollars are financing those efforts in parallel. It's an
arms race (like the pun?). Wonder why a Japanese conglomerate offered to
purchase ARM Ltd. for such a large figure? Wonder why Intel has ARM
licenses now? Your group might only be able to focus on a few ARM
offerings, but there are dozens and dozens of ARM teams alone that would
dispute your arithmetic above.

> -------------
> Soo.. depending on your target hardware, you may be better off with
> gcc if the end goal is general all-around performance.
> (It does a quite respectable job of being generic) I realize a lot of
> people have strong feelings for or against it. I leave that to the
> reader to decide..

You misconstrue the concepts. Nobody, least of all me, is implying that
one pathway (to a unikernel [2], if you like) suits all near-optimized
solutions. That would be pointless. What you allude to already exists at
some of the more progressive data-center and cloud vendors. We are
talking about a unikernel for different classes of problems across
ARMv8, x86-64 and GPU architectures, not thousands of processor (arch)
variants. However, those other processor variants, and the folks who
earn a living off them, are not sitting idle either.

> Back to my own glass house.. It will take a few years, but I am trying
> to make it easier (internally) to expose in some clear way all the
> pieces which compose a fine tuning per-processor. If this was "just"
> scheduling models it would be really easy, but it's not.. Those
> latencies and other magic bits decide things like.. "should I unroll
> this loop or do something else" and then you venture into the land of
> accelerators where a custom regalloc may be what you really need and
> *nothing* off the shelf fits to meet your goals.. (projects like that
> can take 9 months and in the end only give a general 1-5% median
> performance gain..)

If this is your mantra, I rescind the generous comments. Cray used to
work that way, milking the petroleum industry for tons of money, but
things have changed, and the change is accelerating rapidly. Perhaps too
many of those Cray patents that your company owns are leaking toxins
into the brain trust where you park? Vendor walk-back is sad, imho.
ymmv. Best of luck with your company's 5-year plan....

[2] http://unikernel.org/

hth,
James

> --------------
>
>
> On Sat, Aug 20, 2016 at 2:02 AM, james wrote:
>> On 08/19/2016 11:15 AM, C Bergström wrote:
>>>
>>> On Fri, Aug 19, 2016 at 11:01 PM, Luca Barbato wrote:
>>>>
>>>> BTW is pathscale ready to be used as system compiler as well?
>>>
>>>
>>> I wish, but no. We have known issues when building grub2, glibc and
>>> the Linux kernel at the very least. Someone* did report a long time
>>> ago that with their unofficial port, were able to build/boot the
>>> NetBSD kernel.
>>> (*A community dev we trusted with our sources and was helping us with
>>> portability across platforms)
>>>
>>> The stuff with grub2 may potentially be fixed in the "near" future...
>>> the others are more tricky. In general if clang can do it, we have a
>>> strong chance as well.
>>>
>>> As a philosophy - "we" aren't really trying to be the best generic
>>> compiler in the world. We aim more on optimizing as much for known
>>> targets. So if by system you mean, a compiler that would produce an
>>> "OS" which only runs on a single class of hardware, then yeah it could
>>> work at some point in the future. Specifically, on x86 we default on
>>> host CPU optimizations. So on newer Intel hardware it's easy to get a
>>> binary that won't run on AMD or older 64bit Intel.
>>>
>>> More recently on ARMv8 - we turn on processor specific tuning. So
>>> while it may "run", the difference between APM's mustang and Cavium
>>> ThunderX is pretty big and running binaries intended for A and ran on
>>> B would certainly take a hit.. (this is just the tip of the iceberg)
>>>
>>> For general scalar OS code it isn't likely to matter... the real
>>> impact being like 1-10% difference (being very general.. it could be
>>> less or more in the real world..)
>>>
>>> For HPC codes or anything where you get loops or computationally
>>> complex - the gloves are off and I could see big differences...
>>> (again being general and maybe a bit dramatic for fun)
>>
>>
>>
>> OK (actually fantastic!). Looking at the pathscale site pages and github,
>> perhaps a cheap arm embedded board where llvm is the centerpiece of
>> compiling a minimal system to entice gentoo-llvm testers, would be possible
>> in the near future?. I have a 96boards, HiKey arm64v8 that I could dedicate
>> to gentoo+armv8-llvm testing, if that'd help. [1]
>>
>> Perhaps a baseline bootstrap iso (or such) version targeted at
>> llvm-centric testers on x86-64 or armv8 ? Skip grub2 and use grub-legacy or
>> lilo or (?), since there seems to be issues with llvm-grub2.
>>
>>
>> [1] http://dev.gentoo.org/~tgall/
>>
>>
>> No matter how you slice it, from someone who is focused on building
>> minimized and embedded (bare metal) systems that are customized and
>> coalesced into a heterogeneous gentoo cluster for HPC, this is wonderful
>> news. Finally a vendor in the cluster space, with some vision and
>> common-sense, imho. Heterogeneous and open HPC is where is at, imho. If
>> there is a forum where the community and pathscale folks discuss issues,
>> point that out as I could not find one for deeper reading....
>>
>>
>> hth,
>> James
>>
>
>