Message-ID: <4F5FAAD0.2040300@binarywings.net>
Date: Tue, 13 Mar 2012 21:15:12 +0100
From: Florian Philipp
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] hard drive encryption

On 13.03.2012 20:38, Michael Mol wrote:
> On Tue, Mar 13, 2012 at 3:07 PM, Stroller wrote:
>>
>> On 13 March 2012, at 18:18, Michael Mol wrote:
>>> ...
>>>> So I assume the i586 version is better for you --- unless GCC
>>>> suddenly got a lot better at optimizing code.
>>>
>>> Since when, exactly? GCC isn't the best compiler at optimization,
>>> but I fully expect current versions to produce better code for
>>> x86-64 than hand-tuned i586. Wider registers, more registers,
>>> crypto acceleration instructions and SIMD instructions are all
>>> very nice to have. I don't know the specifics of AES, though, or
>>> what kind of crypto algorithm it is, so it's entirely possible
>>> that one can't effectively parallelize it except in some
>>> relatively unique circumstances.
>>
>> Do you have much experience of writing assembler?
>>
>> I don't, and I'm not an expert on this, but I've read the odd blog
>> article on this subject over the years.
>
> Similar level of experience here. I can read it, even debug it from
> time to time. A few regular bloggers on the subject are like candy.
> And I used to have pagetable.org, Ars's Technopaedia and specsheets
> for early x86 and Motorola processors memorized. For the past couple
> years, I've been focusing on reading blogs of language and compiler
> authors, academics involved in proofing, testing and improving them,
> etc.
>
>>
>> What I've read often has the programmer looking at the compiled gcc
>> output and examining what it does. The compiler might not care
>> how many registers it uses, and thus a variable might find itself
>> frequently swapped back into RAM; the programmer does not have any
>> control over the compiler, and IIRC some flags reserve a register
>> for debugging (IIRC -fomit-frame-pointer disables this). I think
>> it's possible to use registers more efficiently by swapping them
>> (??) or by using bitwise comparisons and other tricks.
>
> Sure; it's cheaper to null out a register by XORing it with itself
> than setting it to 0.
>
>>
>> Assembler optimisation is only used on sections of code that are at
>> the core of a loop - that are called hundreds or thousands (even
>> millions?) of times during the program's execution. It's not for
>> code, such as reading the .config file or initialisation, which is
>> only called once. Because the code in the core of the loop is
>> called so often, you don't have to achieve much of an optimisation
>> for the aggregate to be much more considerable.
>
> Sure; optimize the hell out of the code where you spend most of your
> time. I wasn't aware that gcc passed up on safe optimization
> opportunities, though.
>
>>
>> The operations in question may only constitute a few lines of C,
>> or a handful of machine operations, so it boils down to an
>> algorithm that a human programmer is capable of getting a grip on
>> and comprehending. Whilst compilers are clearly more efficient for
>> large programs, on this micro scale, humans are more clever and
>> creative than machines.
>
> I disagree. With defined semantics for the source and target, a
> computer's cleverness is limited only by the computational and
> memory expense of its search algorithms. Humans get through this by
> making a habit of various optimizations, but those habits become less
> useful as additional paths and instructions are added. As system
> complexity increases, humans operate on personally cached techniques
> derived from simpler systems. I would expect very, very few people to
> be intimately familiar with the majority of optimization
> possibilities present on an amdfam10 processor or a core2. Compilers
> aren't necessarily familiar with them, either; they're just quicker
> at discovering them, given knowledge of the individual instructions
> and the rules of language semantics.
>
>>
>> Encryption / decryption is an example of code that lends itself to
>> this kind of optimisation. In particular AES was designed, I
>> believe, to be amenable to implementation in this way. The reason
>> for that was that it was desirable to have it run on embedded
>> devices and on dedicated chips. So it boils down to a simple
>> bitswap operation (??) - the plaintext is modified by the
>> encryption key, input and output as a fast stream. Each byte goes
>> in, each byte goes out, the same function performed on each one.
>
> I'd be willing to posit that you're right here, though if there
> isn't a per-byte feedback mechanism, SIMD instructions would come
> into serious play.
> But I expect there's a per-byte feedback mechanism, so
> parallelization would likely come in the form of processing
> simultaneous streams.
>
>>
>> Another operation that lends itself to assembler optimisation is
>> video decoding - the video is encoded only once, and then may be
>> played back hundreds or millions of times by different people. The
>> same operations must be repeated a number of times on each frame,
>> then c. 25-60 frames are decoded per second, so at least 90,000
>> frames per hour. Again, the smallest optimisation is worthwhile.
>
> Absolutely. My position, though, is that compilers are quicker and
> more capable of discovering optimization possibilities than humans
> are, when the target architecture changes. Sure, you've got several
> dozen video codecs in, say, ffmpeg, and perhaps it all boils down to
> less than a dozen very common cases of inner loop code. With
> hand-tuned optimization, you'd need to fork your assembly patch for
> each new processor feature that comes out, and then work to find the
> most efficient way to execute code on that processor.
>
> There are also cases where processor features get changed. I don't
> remember the name of the instruction (it had something to do with
> stack operations) in x86, but Intel switched it from a 0-cycle
> instruction to something more expensive. Any code which assumed that
> instruction was a 0-cycle instruction now became less efficient. A
> compiler (presuming it has a knowledge of the target processor's
> instruction set properties) would have an easier time coping with
> that change than a human would.
>
> I'm not saying humans are useless; this is just one of those areas
> which is sufficiently complex-yet-deterministic that sufficient
> knowledge of the source and target environments would give a
> computer the edge over a human in finding the optimal sequence of
> CPU instructions.
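
On the feedback question raised in the quoted text: AES itself works on
16-byte blocks, and whether one block depends on the previous one is a
property of the mode of operation, not of the cipher. The sketch below
is only an illustration under loud assumptions: toy_block() is a
made-up stand-in and NOT AES, and all function names are hypothetical.
It contrasts a chained (CBC-style) mode, where encryption is inherently
serial, with a counter (CTR-style) mode, where every block can be
computed on its own and is therefore a candidate for SIMD or multiple
cores.

```python
BLOCK = 16  # AES also uses 16-byte blocks; everything else here is a toy


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block(block: bytes, key: bytes) -> bytes:
    """Hypothetical keyed block function (NOT AES): XOR, then rotate left."""
    mixed = xor(block, key)
    return mixed[1:] + mixed[:1]


def toy_block_inv(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_block: rotate right, then XOR."""
    unrotated = block[-1:] + block[:-1]
    return xor(unrotated, key)


def encrypt_chained(blocks, key, iv):
    """CBC-style: each ciphertext block feeds into the next, so it is serial."""
    out, prev = [], iv
    for block in blocks:
        prev = toy_block(xor(block, prev), key)
        out.append(prev)
    return out


def decrypt_chained(blocks, key, iv):
    """CBC-style decryption: only needs the previous *ciphertext* block."""
    out, prev = [], iv
    for block in blocks:
        out.append(xor(toy_block_inv(block, key), prev))
        prev = block
    return out


def keystream_block(key, counter):
    """CTR-style: the keystream for block i depends only on key and i."""
    return toy_block(counter.to_bytes(BLOCK, "big"), key)


def crypt_counter(blocks, key):
    """CTR-style: blocks are independent; the same call encrypts and decrypts."""
    return [xor(b, keystream_block(key, i)) for i, b in enumerate(blocks)]
```

In the chained variant, changing the first plaintext block changes every
later ciphertext block, which is exactly the kind of feedback that rules
out processing one stream in parallel. In the counter variant, block i
can be computed without looking at any other block.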

This thread is becoming ridiculously long. Just as a last side-note:

One of the primary reasons that the IA64 architecture failed was that it
relied on the compiler to optimize the code in order to exploit the
massive instruction-level parallelism the CPU offered. Compilers never
became good enough for the job. Of course, that happened in the nineties
and we have much better compilers now (and x86 is easier to handle for
compilers). But on the other hand: That was Intel's next big thing and
if they couldn't make the compilers work, I have no reason to believe in
their efficiency now.

Regards,
Florian Philipp