From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80]) by finch.gentoo.org (Postfix) with ESMTP id 0E7141389E2 for ; Thu, 11 Dec 2014 22:14:04 +0000 (UTC) Received: from pigeon.gentoo.org (localhost [127.0.0.1]) by pigeon.gentoo.org (Postfix) with SMTP id 9EC8CE0EB6; Thu, 11 Dec 2014 22:14:03 +0000 (UTC) Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pigeon.gentoo.org (Postfix) with ESMTPS id D42BBE0EB6 for ; Thu, 11 Dec 2014 22:14:01 +0000 (UTC) Received: from oystercatcher.gentoo.org (oystercatcher.gentoo.org [148.251.78.52]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by smtp.gentoo.org (Postfix) with ESMTPS id 2072133FFB1 for ; Thu, 11 Dec 2014 22:14:00 +0000 (UTC) Received: from localhost.localdomain (localhost [127.0.0.1]) by oystercatcher.gentoo.org (Postfix) with ESMTP id 6AEFAC2BC for ; Thu, 11 Dec 2014 22:13:58 +0000 (UTC) From: "Devan Franchini" To: gentoo-commits@lists.gentoo.org Content-Transfer-Encoding: 8bit Content-type: text/plain; charset=UTF-8 Reply-To: gentoo-dev@lists.gentoo.org, "Devan Franchini" Message-ID: <1418335906.d578f6c9c6f31b1100247f5b7df83e3a4ab0842e.twitch153@gentoo> Subject: [gentoo-commits] proj/releng:master commit in: tools-hardened/desktop/configs/ X-VCS-Repository: proj/releng X-VCS-Files: tools-hardened/desktop/configs/loop-AES-kernel-3.10.patch X-VCS-Directories: tools-hardened/desktop/configs/ X-VCS-Committer: twitch153 X-VCS-Committer-Name: Devan Franchini X-VCS-Revision: d578f6c9c6f31b1100247f5b7df83e3a4ab0842e X-VCS-Branch: master Date: Thu, 11 Dec 2014 22:13:58 +0000 (UTC) Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-Id: Gentoo Linux mail X-BeenThere: gentoo-commits@lists.gentoo.org X-Archives-Salt: 7873ce3c-7913-4ae9-b70d-e6ce5607ccd5 X-Archives-Hash: 47ac8fdc436667f715108c7d361b6a96 commit: d578f6c9c6f31b1100247f5b7df83e3a4ab0842e Author: Devan Franchini gentoo org> AuthorDate: Thu Dec 11 22:10:12 2014 +0000 Commit: Devan Franchini gentoo org> CommitDate: Thu Dec 11 22:11:46 2014 +0000 URL: http://sources.gentoo.org/gitweb/?p=proj/releng.git;a=commit;h=d578f6c9 tools-hardened/desktop: Removes loop-AES-kernel-3.10.patch --- .../desktop/configs/loop-AES-kernel-3.10.patch | 9046 -------------------- 1 file changed, 9046 deletions(-) diff --git a/tools-hardened/desktop/configs/loop-AES-kernel-3.10.patch b/tools-hardened/desktop/configs/loop-AES-kernel-3.10.patch deleted file mode 100644 index 466987b..0000000 --- a/tools-hardened/desktop/configs/loop-AES-kernel-3.10.patch +++ /dev/null @@ -1,9046 +0,0 @@ -Before this patch can be applied to kernel, drivers/block/loop.c and -include/linux/loop.h source files must be removed: - - rm -f drivers/block/loop.c include/linux/loop.h - -diff -urN linux-3.10-noloop/drivers/block/Kconfig linux-3.10-AES/drivers/block/Kconfig ---- linux-3.10-noloop/drivers/block/Kconfig 2013-07-01 01:13:29.000000000 +0300 -+++ linux-3.10-AES/drivers/block/Kconfig 2013-07-01 16:12:48.000000000 +0300 -@@ -230,14 +230,6 @@ - bits of, say, a sound file). This is also safe if the file resides - on a remote file server. - -- There are several ways of encrypting disks. Some of these require -- kernel patches. The vanilla kernel offers the cryptoloop option -- and a Device Mapper target (which is superior, as it supports all -- file systems). 
If you want to use the cryptoloop, say Y to both -- LOOP and CRYPTOLOOP, and make sure you have a recent (version 2.12 -- or later) version of util-linux. Additionally, be aware that -- the cryptoloop is not safe for storing journaled filesystems. -- - Note that this loop device has nothing to do with the loopback - device used for network connections from the machine to itself. - -@@ -246,35 +238,40 @@ - - Most users will answer N here. - --config BLK_DEV_LOOP_MIN_COUNT -- int "Number of loop devices to pre-create at init time" -+config BLK_DEV_LOOP_AES -+ bool "AES encrypted loop device support" - depends on BLK_DEV_LOOP -- default 8 -- help -- Static number of loop devices to be unconditionally pre-created -- at init time. -- -- This default value can be overwritten on the kernel command -- line or with module-parameter loop.max_loop. -- -- The historic default is 8. If a late 2011 version of losetup(8) -- is used, it can be set to 0, since needed loop devices can be -- dynamically allocated with the /dev/loop-control interface. -- --config BLK_DEV_CRYPTOLOOP -- tristate "Cryptoloop Support" -- select CRYPTO -- select CRYPTO_CBC -+ ---help--- -+ If you want to use AES encryption algorithm to encrypt loop -+ devices, say Y here. If you don't know what to do here, say N. -+ -+config BLK_DEV_LOOP_KEYSCRUB -+ bool "loop encryption key scrubbing support" - depends on BLK_DEV_LOOP - ---help--- -- Say Y here if you want to be able to use the ciphers that are -- provided by the CryptoAPI as loop transformation. This might be -- used as hard disk encryption. -- -- WARNING: This device is not safe for journaled file systems like -- ext3 or Reiserfs. Please use the Device Mapper crypto module -- instead, which can be configured to be on-disk compatible with the -- cryptoloop device. -+ Loop encryption key scrubbing moves and inverts key bits in -+ kernel RAM so that the thin oxide which forms the storage -+ capacitor dielectric of DRAM cells is not permitted to develop -+ detectable property. For more info, see Peter Gutmann's paper: -+ http://www.cypherpunks.to/~peter/usenix01.pdf -+ -+ Paranoid tinfoil hat crowd say Y here, everyone else say N. -+ -+config BLK_DEV_LOOP_PADLOCK -+ bool "VIA padlock hardware AES support" -+ depends on BLK_DEV_LOOP && BLK_DEV_LOOP_AES && (X86 || X86_64) -+ ---help--- -+ If you have VIA processor that supports padlock xcrypt instructions, -+ say Y here. If enabled, presence of VIA padlock instructions is detected -+ at run time, but code still works on non-padlock processors too. -+ -+config BLK_DEV_LOOP_INTELAES -+ bool "Intel hardware AES support" -+ depends on BLK_DEV_LOOP && BLK_DEV_LOOP_AES && (X86 || X86_64) -+ ---help--- -+ If you have a processor that supports Intel AES instructions, -+ say Y here. If enabled, presence of Intel AES instructions is detected -+ at run time, but code still works on older processors too. - - source "drivers/block/drbd/Kconfig" - -diff -urN linux-3.10-noloop/drivers/block/loop.c linux-3.10-AES/drivers/block/loop.c ---- linux-3.10-noloop/drivers/block/loop.c 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/block/loop.c 2013-07-01 16:13:16.000000000 +0300 -@@ -0,0 +1,3196 @@ -+/* -+ * linux/drivers/block/loop.c -+ * -+ * Written by Theodore Ts'o, 3/29/93 -+ * -+ * Copyright 1993 by Theodore Ts'o. Redistribution of this file is -+ * permitted under the GNU General Public License. 
-+ * -+ * DES encryption plus some minor changes by Werner Almesberger, 30-MAY-1993 -+ * more DES encryption plus IDEA encryption by Nicholas J. Leon, June 20, 1996 -+ * -+ * Modularized and updated for 1.1.16 kernel - Mitch Dsouza 28th May 1994 -+ * Adapted for 1.3.59 kernel - Andries Brouwer, 1 Feb 1996 -+ * -+ * Fixed do_loop_request() re-entrancy - Vincent.Renardias@waw.com Mar 20, 1997 -+ * -+ * Added devfs support - Richard Gooch 16-Jan-1998 -+ * -+ * Handle sparse backing files correctly - Kenn Humborg, Jun 28, 1998 -+ * -+ * Loadable modules and other fixes by AK, 1998 -+ * -+ * Make real block number available to downstream transfer functions, enables -+ * CBC (and relatives) mode encryption requiring unique IVs per data block. -+ * Reed H. Petty, rhp@draper.net -+ * -+ * Maximum number of loop devices now dynamic via max_loop module parameter. -+ * Russell Kroll 19990701 -+ * -+ * Maximum number of loop devices when compiled-in now selectable by passing -+ * max_loop=<1-255> to the kernel on boot. -+ * Erik I. Bolsų, , Oct 31, 1999 -+ * -+ * Completely rewrite request handling to be make_request_fn style and -+ * non blocking, pushing work to a helper thread. Lots of fixes from -+ * Al Viro too. -+ * Jens Axboe , Nov 2000 -+ * -+ * Support up to 256 loop devices -+ * Heinz Mauelshagen , Feb 2002 -+ * -+ * AES transfer added. IV is now passed as (512 byte) sector number. -+ * Jari Ruusu, May 18 2001 -+ * -+ * External encryption module locking bug fixed. -+ * Ingo Rohloff , June 21 2001 -+ * -+ * Make device backed loop work with swap (pre-allocated buffers + queue rewrite). -+ * Jari Ruusu, September 2 2001 -+ * -+ * Ported 'pre-allocated buffers + queue rewrite' to BIO for 2.5 kernels -+ * Ben Slusky , March 1 2002 -+ * Jari Ruusu, March 27 2002 -+ * -+ * File backed code now uses file->f_op->read/write. Based on Andrew Morton's idea. -+ * Jari Ruusu, May 23 2002 -+ * -+ * Exported hard sector size correctly, fixed file-backed-loop-on-tmpfs bug, -+ * plus many more enhancements and optimizations. -+ * Adam J. Richter , Aug 2002 -+ * -+ * Added support for removing offset from IV computations. -+ * Jari Ruusu, September 21 2003 -+ * -+ * Added support for MD5 IV computation and multi-key operation. -+ * Jari Ruusu, October 8 2003 -+ * -+ * -+ * Still To Fix: -+ * - Advisory locking is ignored here. 
-+ * - Should use an own CAP_* category instead of CAP_SYS_ADMIN -+ */ -+ -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#ifdef CONFIG_DEVFS_FS -+# include -+#endif -+#include -+#include -+#include -+#include -+#include -+#include /* for invalidate_bdev() */ -+#include -+#include -+#if defined(CONFIG_COMPAT) && defined(HAVE_COMPAT_IOCTL) -+# include -+#endif -+#include -+#include -+#include -+#include -+ -+#include -+#include -+#if (defined(CONFIG_BLK_DEV_LOOP_PADLOCK) || defined(CONFIG_BLK_DEV_LOOP_INTELAES)) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+# include -+#endif -+#if defined(CONFIG_BLK_DEV_LOOP_INTELAES) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+# include -+#endif -+ -+#if defined(CONFIG_X86) && !defined(CONFIG_X86_64) -+# define X86_ASM 1 -+#endif -+#if defined(CONFIG_X86_64) -+# define AMD64_ASM 1 -+#endif -+ -+#include "../misc/aes.h" -+#include "../misc/md5.h" -+ -+#if defined(CONFIG_COMPAT) && !defined(HAVE_COMPAT_IOCTL) -+# include -+# define IOCTL32_COMPATIBLE_PTR ((void*)0) -+#endif -+ -+//#define LOOP_HAVE_CONGESTED_FN 1 -+ -+#define L_BIO_RW_AHEAD (REQ_RAHEAD) -+#define L_BIO_RW_NOIDLE (REQ_NOIDLE) -+#define L_BIO_RW_SYNCIO (REQ_SYNC) -+ -+static int max_loop = 8; -+ -+#ifdef MODULE -+module_param(max_loop, int, 0); -+MODULE_PARM_DESC(max_loop, "Maximum number of loop devices (1-256)"); -+#else -+static int __init max_loop_setup(char *str) -+{ -+ int y; -+ -+ if (get_option(&str, &y) == 1) -+ max_loop = y; -+ return 1; -+} -+__setup("max_loop=", max_loop_setup); -+#endif -+ -+static struct gendisk **disks; -+ -+/* -+ * Transfer functions -+ */ -+static int transfer_none(struct loop_device *lo, int cmd, char *raw_buf, -+ char *loop_buf, int size, sector_t real_block) -+{ -+ /* this code is only called from file backed loop */ -+ /* and that code expects this function to be no-op */ -+ -+ cond_resched(); -+ return 0; -+} -+ -+static int transfer_xor(struct loop_device *lo, int cmd, char *raw_buf, -+ char *loop_buf, int size, sector_t real_block) -+{ -+ char *in, *out, *key; -+ int i, keysize; -+ -+ if (cmd == READ) { -+ in = raw_buf; -+ out = loop_buf; -+ } else { -+ in = loop_buf; -+ out = raw_buf; -+ } -+ -+ key = lo->lo_encrypt_key; -+ keysize = lo->lo_encrypt_key_size; -+ for (i = 0; i < size; i++) -+ *out++ = *in++ ^ key[(i & 511) % keysize]; -+ cond_resched(); -+ return 0; -+} -+ -+static int xor_init(struct loop_device *lo, struct loop_info64 *info) -+{ -+ if (info->lo_encrypt_key_size <= 0) -+ return -EINVAL; -+ return 0; -+} -+ -+static struct loop_func_table none_funcs = { -+ .number = LO_CRYPT_NONE, -+ .transfer = transfer_none, -+}; -+ -+static struct loop_func_table xor_funcs = { -+ .number = LO_CRYPT_XOR, -+ .transfer = transfer_xor, -+ .init = xor_init, -+}; -+ -+#ifdef CONFIG_BLK_DEV_LOOP_AES -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+# define KEY_ALLOC_COUNT 128 -+#else -+# define KEY_ALLOC_COUNT 64 -+#endif -+ -+typedef struct { -+ aes_context *keyPtr[KEY_ALLOC_COUNT]; -+ unsigned keyMask; -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ u_int32_t *partialMD5; -+ u_int32_t partialMD5buf[8]; -+ rwlock_t rwlock; -+ unsigned reversed; -+ unsigned blocked; -+ struct timer_list timer; -+#else -+ u_int32_t partialMD5[4]; -+#endif -+#if defined(CONFIG_BLK_DEV_LOOP_PADLOCK) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+ u_int32_t padlock_cw_e; -+ u_int32_t padlock_cw_d; -+#endif -+} AESmultiKey; -+ -+#if 
(defined(CONFIG_BLK_DEV_LOOP_PADLOCK) || defined(CONFIG_BLK_DEV_LOOP_INTELAES)) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+/* This function allocates AES context structures at special address such */ -+/* that returned address % 16 == 8 . That way expanded encryption and */ -+/* decryption keys in AES context structure are always 16 byte aligned */ -+static void *specialAligned_kmalloc(size_t size, unsigned int flags) -+{ -+ void *pn, **ps; -+ pn = kmalloc(size + (16 + 8), flags); -+ if(!pn) return (void *)0; -+ ps = (void **)((((unsigned long)pn + 15) & ~((unsigned long)15)) + 8); -+ *(ps - 1) = pn; -+ return (void *)ps; -+} -+static void specialAligned_kfree(void *ps) -+{ -+ if(ps) kfree(*((void **)ps - 1)); -+} -+# define specialAligned_ctxSize ((sizeof(aes_context) + 15) & ~15) -+#else -+# define specialAligned_kmalloc kmalloc -+# define specialAligned_kfree kfree -+# define specialAligned_ctxSize sizeof(aes_context) -+#endif -+ -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+static void keyScrubWork(AESmultiKey *m) -+{ -+ aes_context *a0, *a1; -+ u_int32_t *p; -+ int x, y, z; -+ -+ z = m->keyMask + 1; -+ for(x = 0; x < z; x++) { -+ a0 = m->keyPtr[x]; -+ a1 = m->keyPtr[x + z]; -+ memcpy(a1, a0, sizeof(aes_context)); -+ m->keyPtr[x] = a1; -+ m->keyPtr[x + z] = a0; -+ p = (u_int32_t *) a0; -+ y = sizeof(aes_context) / sizeof(u_int32_t); -+ while(y > 0) { -+ *p ^= 0xFFFFFFFF; -+ p++; -+ y--; -+ } -+ } -+ -+ x = m->reversed; /* x is 0 or 4 */ -+ m->reversed ^= 4; -+ y = m->reversed; /* y is 4 or 0 */ -+ p = &m->partialMD5buf[x]; -+ memcpy(&m->partialMD5buf[y], p, 16); -+ m->partialMD5 = &m->partialMD5buf[y]; -+ p[0] ^= 0xFFFFFFFF; -+ p[1] ^= 0xFFFFFFFF; -+ p[2] ^= 0xFFFFFFFF; -+ p[3] ^= 0xFFFFFFFF; -+ -+ /* try to flush dirty cache data to RAM */ -+#if !defined(CONFIG_XEN) && (defined(CONFIG_X86_64) || (defined(CONFIG_X86) && !defined(CONFIG_M386) && !defined(CONFIG_CPU_386))) -+ __asm__ __volatile__ ("wbinvd": : :"memory"); -+#else -+ mb(); -+#endif -+} -+ -+/* called only from loop thread process context */ -+static void keyScrubThreadFn(AESmultiKey *m) -+{ -+ write_lock(&m->rwlock); -+ if(!m->blocked) keyScrubWork(m); -+ write_unlock(&m->rwlock); -+} -+ -+#if defined(NEW_TIMER_VOID_PTR_PARAM) -+# define KeyScrubTimerFnParamType void * -+#else -+# define KeyScrubTimerFnParamType unsigned long -+#endif -+ -+static void keyScrubTimerFn(KeyScrubTimerFnParamType); -+ -+static void keyScrubTimerInit(struct loop_device *lo) -+{ -+ AESmultiKey *m; -+ unsigned long expire; -+ -+ m = (AESmultiKey *)lo->key_data; -+ expire = jiffies + HZ; -+ init_timer(&m->timer); -+ m->timer.expires = expire; -+ m->timer.data = (KeyScrubTimerFnParamType)lo; -+ m->timer.function = keyScrubTimerFn; -+ add_timer(&m->timer); -+} -+ -+/* called only from timer handler context */ -+static void keyScrubTimerFn(KeyScrubTimerFnParamType d) -+{ -+ struct loop_device *lo = (struct loop_device *)d; -+ extern void loop_add_keyscrub_fn(struct loop_device *, void (*)(void *), void *); -+ -+ /* rw lock needs process context, so make loop thread do scrubbing */ -+ loop_add_keyscrub_fn(lo, (void (*)(void*))keyScrubThreadFn, lo->key_data); -+ /* start timer again */ -+ keyScrubTimerInit(lo); -+} -+#endif -+ -+static AESmultiKey *allocMultiKey(void) -+{ -+ AESmultiKey *m; -+ aes_context *a; -+ int x = 0, n; -+ -+ m = (AESmultiKey *) kmalloc(sizeof(AESmultiKey), GFP_KERNEL); -+ if(!m) return 0; -+ memset(m, 0, sizeof(AESmultiKey)); -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ m->partialMD5 = &m->partialMD5buf[0]; -+ 
rwlock_init(&m->rwlock); -+ init_timer(&m->timer); -+ again: -+#endif -+ -+ n = PAGE_SIZE / specialAligned_ctxSize; -+ if(!n) n = 1; -+ -+ a = (aes_context *) specialAligned_kmalloc(specialAligned_ctxSize * n, GFP_KERNEL); -+ if(!a) { -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ if(x) specialAligned_kfree(m->keyPtr[0]); -+#endif -+ kfree(m); -+ return 0; -+ } -+ -+ while((x < KEY_ALLOC_COUNT) && n) { -+ m->keyPtr[x] = a; -+ a = (aes_context *)((unsigned char *)a + specialAligned_ctxSize); -+ x++; -+ n--; -+ } -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ if(x < 2) goto again; -+#endif -+ return m; -+} -+ -+static void clearAndFreeMultiKey(AESmultiKey *m) -+{ -+ aes_context *a; -+ int x, n; -+ -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ /* stop scrub timer. loop thread was killed earlier */ -+ del_timer_sync(&m->timer); -+ /* make sure allocated keys are in original order */ -+ if(m->reversed) keyScrubWork(m); -+#endif -+ n = PAGE_SIZE / specialAligned_ctxSize; -+ if(!n) n = 1; -+ -+ x = 0; -+ while(x < KEY_ALLOC_COUNT) { -+ a = m->keyPtr[x]; -+ if(!a) break; -+ memset(a, 0, specialAligned_ctxSize * n); -+ specialAligned_kfree(a); -+ x += n; -+ } -+ -+ memset(m, 0, sizeof(AESmultiKey)); -+ kfree(m); -+} -+ -+static int multiKeySetup(struct loop_device *lo, unsigned char *k, int version3) -+{ -+ AESmultiKey *m; -+ aes_context *a; -+ int x, y, n, err = 0; -+ union { -+ u_int32_t w[16]; -+ unsigned char b[64]; -+ } un; -+ -+#if LINUX_VERSION_CODE >= 0x30600 -+ if(!uid_eq(lo->lo_key_owner, current_uid()) && !capable(CAP_SYS_ADMIN)) -+ return -EPERM; -+#elif LINUX_VERSION_CODE >= 0x2061c -+ if(lo->lo_key_owner != current_uid() && !capable(CAP_SYS_ADMIN)) -+ return -EPERM; -+#else -+ if(lo->lo_key_owner != current->uid && !capable(CAP_SYS_ADMIN)) -+ return -EPERM; -+#endif -+ -+ m = (AESmultiKey *)lo->key_data; -+ if(!m) return -ENXIO; -+ -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ /* temporarily prevent loop thread from messing with keys */ -+ write_lock(&m->rwlock); -+ m->blocked = 1; -+ /* make sure allocated keys are in original order */ -+ if(m->reversed) keyScrubWork(m); -+ write_unlock(&m->rwlock); -+#endif -+ n = PAGE_SIZE / specialAligned_ctxSize; -+ if(!n) n = 1; -+ -+ x = 0; -+ while(x < KEY_ALLOC_COUNT) { -+ if(!m->keyPtr[x]) { -+ a = (aes_context *) specialAligned_kmalloc(specialAligned_ctxSize * n, GFP_KERNEL); -+ if(!a) { -+ err = -ENOMEM; -+ goto error_out; -+ } -+ y = x; -+ while((y < (x + n)) && (y < KEY_ALLOC_COUNT)) { -+ m->keyPtr[y] = a; -+ a = (aes_context *)((unsigned char *)a + specialAligned_ctxSize); -+ y++; -+ } -+ } -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ if(x >= 64) { -+ x++; -+ continue; -+ } -+#endif -+ if(copy_from_user(&un.b[0], k, 32)) { -+ err = -EFAULT; -+ goto error_out; -+ } -+ aes_set_key(m->keyPtr[x], &un.b[0], lo->lo_encrypt_key_size, 0); -+ k += 32; -+ x++; -+ } -+ -+ m->partialMD5[0] = 0x67452301; -+ m->partialMD5[1] = 0xefcdab89; -+ m->partialMD5[2] = 0x98badcfe; -+ m->partialMD5[3] = 0x10325476; -+ if(version3) { -+ /* only first 128 bits of iv-key is used */ -+ if(copy_from_user(&un.b[0], k, 16)) { -+ err = -EFAULT; -+ goto error_out; -+ } -+#if defined(__BIG_ENDIAN) -+ un.w[0] = cpu_to_le32(un.w[0]); -+ un.w[1] = cpu_to_le32(un.w[1]); -+ un.w[2] = cpu_to_le32(un.w[2]); -+ un.w[3] = cpu_to_le32(un.w[3]); -+#endif -+ memset(&un.b[16], 0, 48); -+ md5_transform_CPUbyteorder(&m->partialMD5[0], &un.w[0]); -+ lo->lo_flags |= 0x080000; /* multi-key-v3 (info exported to user space) */ -+ } -+ -+ m->keyMask = 0x3F; /* range 0...63 */ -+ lo->lo_flags |= 0x100000; /* 
multi-key (info exported to user space) */ -+ memset(&un.b[0], 0, 32); -+error_out: -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ /* re-enable loop thread key scrubbing */ -+ write_lock(&m->rwlock); -+ m->blocked = 0; -+ write_unlock(&m->rwlock); -+#endif -+ return err; -+} -+ -+static int keySetup_aes(struct loop_device *lo, struct loop_info64 *info) -+{ -+ AESmultiKey *m; -+ union { -+ u_int32_t w[8]; /* needed for 4 byte alignment for b[] */ -+ unsigned char b[32]; -+ } un; -+ -+ lo->key_data = m = allocMultiKey(); -+ if(!m) return(-ENOMEM); -+ memcpy(&un.b[0], &info->lo_encrypt_key[0], 32); -+ aes_set_key(m->keyPtr[0], &un.b[0], info->lo_encrypt_key_size, 0); -+ memset(&info->lo_encrypt_key[0], 0, sizeof(info->lo_encrypt_key)); -+ memset(&un.b[0], 0, 32); -+#if defined(CONFIG_BLK_DEV_LOOP_PADLOCK) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+ switch(info->lo_encrypt_key_size) { -+ case 256: /* bits */ -+ case 32: /* bytes */ -+ /* 14 rounds, AES, software key gen, normal oper, encrypt, 256-bit key */ -+ m->padlock_cw_e = 14 | (1<<7) | (2<<10); -+ /* 14 rounds, AES, software key gen, normal oper, decrypt, 256-bit key */ -+ m->padlock_cw_d = 14 | (1<<7) | (1<<9) | (2<<10); -+ break; -+ case 192: /* bits */ -+ case 24: /* bytes */ -+ /* 12 rounds, AES, software key gen, normal oper, encrypt, 192-bit key */ -+ m->padlock_cw_e = 12 | (1<<7) | (1<<10); -+ /* 12 rounds, AES, software key gen, normal oper, decrypt, 192-bit key */ -+ m->padlock_cw_d = 12 | (1<<7) | (1<<9) | (1<<10); -+ break; -+ default: -+ /* 10 rounds, AES, software key gen, normal oper, encrypt, 128-bit key */ -+ m->padlock_cw_e = 10 | (1<<7); -+ /* 10 rounds, AES, software key gen, normal oper, decrypt, 128-bit key */ -+ m->padlock_cw_d = 10 | (1<<7) | (1<<9); -+ break; -+ } -+#endif -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ keyScrubTimerInit(lo); -+#endif -+ return(0); -+} -+ -+static int keyClean_aes(struct loop_device *lo) -+{ -+ if(lo->key_data) { -+ clearAndFreeMultiKey((AESmultiKey *)lo->key_data); -+ lo->key_data = 0; -+ } -+ return(0); -+} -+ -+static int handleIoctl_aes(struct loop_device *lo, int cmd, unsigned long arg) -+{ -+ int err; -+ -+ switch (cmd) { -+ case LOOP_MULTI_KEY_SETUP: -+ err = multiKeySetup(lo, (unsigned char *)arg, 0); -+ break; -+ case LOOP_MULTI_KEY_SETUP_V3: -+ err = multiKeySetup(lo, (unsigned char *)arg, 1); -+ break; -+ default: -+ err = -EINVAL; -+ } -+ return err; -+} -+ -+void loop_compute_sector_iv(sector_t devSect, u_int32_t *ivout) -+{ -+ if(sizeof(sector_t) == 8) { -+ ivout[0] = cpu_to_le32(devSect); -+ ivout[1] = cpu_to_le32((u_int64_t)devSect>>32); -+ ivout[3] = ivout[2] = 0; -+ } else { -+ ivout[0] = cpu_to_le32(devSect); -+ ivout[3] = ivout[2] = ivout[1] = 0; -+ } -+} -+ -+void loop_compute_md5_iv_v3(sector_t devSect, u_int32_t *ivout, u_int32_t *data) -+{ -+ int x; -+#if defined(__BIG_ENDIAN) -+ int y, e; -+#endif -+ u_int32_t buf[16]; -+ -+#if defined(__BIG_ENDIAN) -+ y = 7; -+ e = 16; -+ do { -+ if (!y) { -+ e = 12; -+ /* md5_transform_CPUbyteorder wants data in CPU byte order */ -+ /* devSect is already in CPU byte order -- no need to convert */ -+ if(sizeof(sector_t) == 8) { -+ /* use only 56 bits of sector number */ -+ buf[12] = devSect; -+ buf[13] = (((u_int64_t)devSect >> 32) & 0xFFFFFF) | 0x80000000; -+ } else { -+ /* 32 bits of sector number + 24 zero bits */ -+ buf[12] = devSect; -+ buf[13] = 0x80000000; -+ } -+ /* 4024 bits == 31 * 128 bit plaintext blocks + 56 bits of sector number */ -+ /* For version 3 on-disk format this really should be 4536 bits, but can't be 
*/ -+ /* changed without breaking compatibility. V3 uses MD5-with-wrong-length IV */ -+ buf[14] = 4024; -+ buf[15] = 0; -+ } -+ x = 0; -+ do { -+ buf[x ] = cpu_to_le32(data[0]); -+ buf[x + 1] = cpu_to_le32(data[1]); -+ buf[x + 2] = cpu_to_le32(data[2]); -+ buf[x + 3] = cpu_to_le32(data[3]); -+ x += 4; -+ data += 4; -+ } while (x < e); -+ md5_transform_CPUbyteorder(&ivout[0], &buf[0]); -+ } while (--y >= 0); -+ ivout[0] = cpu_to_le32(ivout[0]); -+ ivout[1] = cpu_to_le32(ivout[1]); -+ ivout[2] = cpu_to_le32(ivout[2]); -+ ivout[3] = cpu_to_le32(ivout[3]); -+#else -+ x = 6; -+ do { -+ md5_transform_CPUbyteorder(&ivout[0], data); -+ data += 16; -+ } while (--x >= 0); -+ memcpy(buf, data, 48); -+ /* md5_transform_CPUbyteorder wants data in CPU byte order */ -+ /* devSect is already in CPU byte order -- no need to convert */ -+ if(sizeof(sector_t) == 8) { -+ /* use only 56 bits of sector number */ -+ buf[12] = devSect; -+ buf[13] = (((u_int64_t)devSect >> 32) & 0xFFFFFF) | 0x80000000; -+ } else { -+ /* 32 bits of sector number + 24 zero bits */ -+ buf[12] = devSect; -+ buf[13] = 0x80000000; -+ } -+ /* 4024 bits == 31 * 128 bit plaintext blocks + 56 bits of sector number */ -+ /* For version 3 on-disk format this really should be 4536 bits, but can't be */ -+ /* changed without breaking compatibility. V3 uses MD5-with-wrong-length IV */ -+ buf[14] = 4024; -+ buf[15] = 0; -+ md5_transform_CPUbyteorder(&ivout[0], &buf[0]); -+#endif -+} -+ -+/* this function exists for compatibility with old external cipher modules */ -+void loop_compute_md5_iv(sector_t devSect, u_int32_t *ivout, u_int32_t *data) -+{ -+ ivout[0] = 0x67452301; -+ ivout[1] = 0xefcdab89; -+ ivout[2] = 0x98badcfe; -+ ivout[3] = 0x10325476; -+ loop_compute_md5_iv_v3(devSect, ivout, data); -+} -+ -+/* Some external modules do not know if md5_transform_CPUbyteorder() */ -+/* is asmlinkage or not, so here is C language wrapper for them. */ -+void md5_transform_CPUbyteorder_C(u_int32_t *hash, u_int32_t const *in) -+{ -+ md5_transform_CPUbyteorder(hash, in); -+} -+ -+#if defined(CONFIG_X86_64) && defined(AMD64_ASM) -+# define HAVE_MD5_2X_IMPLEMENTATION 1 -+#endif -+#if defined(HAVE_MD5_2X_IMPLEMENTATION) -+/* -+ * This 2x code is currently only available on little endian AMD64 -+ * This 2x code assumes little endian byte order -+ * Context A input data is at zero offset, context B at data + 512 bytes -+ * Context A ivout at zero offset, context B at ivout + 16 bytes -+ */ -+void loop_compute_md5_iv_v3_2x(sector_t devSect, u_int32_t *ivout, u_int32_t *data) -+{ -+ int x; -+ u_int32_t buf[2*16]; -+ -+ x = 6; -+ do { -+ md5_transform_CPUbyteorder_2x(&ivout[0], data, data + (512/4)); -+ data += 16; -+ } while (--x >= 0); -+ memcpy(&buf[0], data, 48); -+ memcpy(&buf[16], data + (512/4), 48); -+ /* md5_transform_CPUbyteorder wants data in CPU byte order */ -+ /* devSect is already in CPU byte order -- no need to convert */ -+ if(sizeof(sector_t) == 8) { -+ /* use only 56 bits of sector number */ -+ buf[12] = devSect; -+ buf[13] = (((u_int64_t)devSect >> 32) & 0xFFFFFF) | 0x80000000; -+ buf[16 + 12] = ++devSect; -+ buf[16 + 13] = (((u_int64_t)devSect >> 32) & 0xFFFFFF) | 0x80000000; -+ } else { -+ /* 32 bits of sector number + 24 zero bits */ -+ buf[12] = devSect; -+ buf[16 + 13] = buf[13] = 0x80000000; -+ buf[16 + 12] = ++devSect; -+ } -+ /* 4024 bits == 31 * 128 bit plaintext blocks + 56 bits of sector number */ -+ /* For version 3 on-disk format this really should be 4536 bits, but can't be */ -+ /* changed without breaking compatibility. 
V3 uses MD5-with-wrong-length IV */ -+ buf[16 + 14] = buf[14] = 4024; -+ buf[16 + 15] = buf[15] = 0; -+ md5_transform_CPUbyteorder_2x(&ivout[0], &buf[0], &buf[16]); -+} -+#endif /* defined(HAVE_MD5_2X_IMPLEMENTATION) */ -+ -+/* -+ * Special requirements for transfer functions: -+ * (1) Plaintext data (loop_buf) may change while it is being read. -+ * (2) On 2.2 and older kernels ciphertext buffer (raw_buf) may be doing -+ * writes to disk at any time, so it can't be used as temporary buffer. -+ */ -+static int transfer_aes(struct loop_device *lo, int cmd, char *raw_buf, -+ char *loop_buf, int size, sector_t devSect) -+{ -+ aes_context *a; -+ AESmultiKey *m; -+ int x; -+ unsigned y; -+ u_int64_t iv[4], *dip; -+ -+ if(!size || (size & 511)) { -+ return -EINVAL; -+ } -+ m = (AESmultiKey *)lo->key_data; -+ y = m->keyMask; -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ read_lock(&m->rwlock); -+#endif -+ if(cmd == READ) { -+#if defined(HAVE_MD5_2X_IMPLEMENTATION) -+ /* if possible, use faster 2x MD5 implementation, currently AMD64 only (#6) */ -+ while((size >= (2*512)) && y) { -+ /* multi-key mode, decrypt 2 sectors at a time */ -+ a = m->keyPtr[((unsigned)devSect ) & y]; -+ /* decrypt using fake all-zero IV, first sector */ -+ memset(iv, 0, 16); -+ x = 15; -+ do { -+ memcpy(&iv[2], raw_buf, 16); -+ aes_decrypt(a, raw_buf, loop_buf); -+ *((u_int64_t *)(&loop_buf[0])) ^= iv[0]; -+ *((u_int64_t *)(&loop_buf[8])) ^= iv[1]; -+ raw_buf += 16; -+ loop_buf += 16; -+ memcpy(iv, raw_buf, 16); -+ aes_decrypt(a, raw_buf, loop_buf); -+ *((u_int64_t *)(&loop_buf[0])) ^= iv[2]; -+ *((u_int64_t *)(&loop_buf[8])) ^= iv[3]; -+ raw_buf += 16; -+ loop_buf += 16; -+ } while(--x >= 0); -+ a = m->keyPtr[((unsigned)devSect + 1) & y]; -+ /* decrypt using fake all-zero IV, second sector */ -+ memset(iv, 0, 16); -+ x = 15; -+ do { -+ memcpy(&iv[2], raw_buf, 16); -+ aes_decrypt(a, raw_buf, loop_buf); -+ *((u_int64_t *)(&loop_buf[0])) ^= iv[0]; -+ *((u_int64_t *)(&loop_buf[8])) ^= iv[1]; -+ raw_buf += 16; -+ loop_buf += 16; -+ memcpy(iv, raw_buf, 16); -+ aes_decrypt(a, raw_buf, loop_buf); -+ *((u_int64_t *)(&loop_buf[0])) ^= iv[2]; -+ *((u_int64_t *)(&loop_buf[8])) ^= iv[3]; -+ raw_buf += 16; -+ loop_buf += 16; -+ } while(--x >= 0); -+ /* compute correct IV */ -+ memcpy(&iv[0], &m->partialMD5[0], 16); -+ memcpy(&iv[2], &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3_2x(devSect, (u_int32_t *)iv, (u_int32_t *)(loop_buf - 1008)); -+ /* XOR with correct IV now */ -+ *((u_int64_t *)(loop_buf - 1024)) ^= iv[0]; -+ *((u_int64_t *)(loop_buf - 1016)) ^= iv[1]; -+ *((u_int64_t *)(loop_buf - 512)) ^= iv[2]; -+ *((u_int64_t *)(loop_buf - 504)) ^= iv[3]; -+ size -= 2*512; -+ devSect += 2; -+ } -+#endif /* defined(HAVE_MD5_2X_IMPLEMENTATION) */ -+ while(size) { -+ /* decrypt one sector at a time */ -+ a = m->keyPtr[((unsigned)devSect) & y]; -+ /* decrypt using fake all-zero IV */ -+ memset(iv, 0, 16); -+ x = 15; -+ do { -+ memcpy(&iv[2], raw_buf, 16); -+ aes_decrypt(a, raw_buf, loop_buf); -+ *((u_int64_t *)(&loop_buf[0])) ^= iv[0]; -+ *((u_int64_t *)(&loop_buf[8])) ^= iv[1]; -+ raw_buf += 16; -+ loop_buf += 16; -+ memcpy(iv, raw_buf, 16); -+ aes_decrypt(a, raw_buf, loop_buf); -+ *((u_int64_t *)(&loop_buf[0])) ^= iv[2]; -+ *((u_int64_t *)(&loop_buf[8])) ^= iv[3]; -+ raw_buf += 16; -+ loop_buf += 16; -+ } while(--x >= 0); -+ if(y) { -+ /* multi-key mode, compute correct IV */ -+ memcpy(iv, &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3(devSect, (u_int32_t *)iv, (u_int32_t *)(loop_buf - 496)); -+ } else { -+ /* single-key mode, compute 
correct IV */ -+ loop_compute_sector_iv(devSect, (u_int32_t *)iv); -+ } -+ /* XOR with correct IV now */ -+ *((u_int64_t *)(loop_buf - 512)) ^= iv[0]; -+ *((u_int64_t *)(loop_buf - 504)) ^= iv[1]; -+ size -= 512; -+ devSect++; -+ } -+ } else { -+#if defined(HAVE_MD5_2X_IMPLEMENTATION) && (LINUX_VERSION_CODE >= 0x20400) -+ /* if possible, use faster 2x MD5 implementation, currently AMD64 only (#5) */ -+ while((size >= (2*512)) && y) { -+ /* multi-key mode, encrypt 2 sectors at a time */ -+ memcpy(raw_buf, loop_buf, 2*512); -+ memcpy(&iv[0], &m->partialMD5[0], 16); -+ memcpy(&iv[2], &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3_2x(devSect, (u_int32_t *)iv, (u_int32_t *)(&raw_buf[16])); -+ /* first sector */ -+ a = m->keyPtr[((unsigned)devSect ) & y]; -+ dip = &iv[0]; -+ x = 15; -+ do { -+ *((u_int64_t *)(&raw_buf[0])) ^= dip[0]; -+ *((u_int64_t *)(&raw_buf[8])) ^= dip[1]; -+ aes_encrypt(a, raw_buf, raw_buf); -+ dip = (u_int64_t *)raw_buf; -+ raw_buf += 16; -+ *((u_int64_t *)(&raw_buf[0])) ^= dip[0]; -+ *((u_int64_t *)(&raw_buf[8])) ^= dip[1]; -+ aes_encrypt(a, raw_buf, raw_buf); -+ dip = (u_int64_t *)raw_buf; -+ raw_buf += 16; -+ } while(--x >= 0); -+ /* second sector */ -+ a = m->keyPtr[((unsigned)devSect + 1) & y]; -+ dip = &iv[2]; -+ x = 15; -+ do { -+ *((u_int64_t *)(&raw_buf[0])) ^= dip[0]; -+ *((u_int64_t *)(&raw_buf[8])) ^= dip[1]; -+ aes_encrypt(a, raw_buf, raw_buf); -+ dip = (u_int64_t *)raw_buf; -+ raw_buf += 16; -+ *((u_int64_t *)(&raw_buf[0])) ^= dip[0]; -+ *((u_int64_t *)(&raw_buf[8])) ^= dip[1]; -+ aes_encrypt(a, raw_buf, raw_buf); -+ dip = (u_int64_t *)raw_buf; -+ raw_buf += 16; -+ } while(--x >= 0); -+ loop_buf += 2*512; -+ size -= 2*512; -+ devSect += 2; -+ } -+#endif /* defined(HAVE_MD5_2X_IMPLEMENTATION) && (LINUX_VERSION_CODE >= 0x20400) */ -+ while(size) { -+ /* encrypt one sector at a time */ -+ a = m->keyPtr[((unsigned)devSect) & y]; -+ if(y) { -+ /* multi-key mode encrypt, linux 2.4 and newer */ -+ memcpy(raw_buf, loop_buf, 512); -+ memcpy(iv, &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3(devSect, (u_int32_t *)iv, (u_int32_t *)(&raw_buf[16])); -+ dip = iv; -+ x = 15; -+ do { -+ *((u_int64_t *)(&raw_buf[0])) ^= dip[0]; -+ *((u_int64_t *)(&raw_buf[8])) ^= dip[1]; -+ aes_encrypt(a, raw_buf, raw_buf); -+ dip = (u_int64_t *)raw_buf; -+ raw_buf += 16; -+ *((u_int64_t *)(&raw_buf[0])) ^= dip[0]; -+ *((u_int64_t *)(&raw_buf[8])) ^= dip[1]; -+ aes_encrypt(a, raw_buf, raw_buf); -+ dip = (u_int64_t *)raw_buf; -+ raw_buf += 16; -+ } while(--x >= 0); -+ loop_buf += 512; -+ } else { -+ /* single-key mode encrypt */ -+ loop_compute_sector_iv(devSect, (u_int32_t *)iv); -+ dip = iv; -+ x = 15; -+ do { -+ iv[2] = *((u_int64_t *)(&loop_buf[0])) ^ dip[0]; -+ iv[3] = *((u_int64_t *)(&loop_buf[8])) ^ dip[1]; -+ aes_encrypt(a, (unsigned char *)(&iv[2]), raw_buf); -+ dip = (u_int64_t *)raw_buf; -+ loop_buf += 16; -+ raw_buf += 16; -+ iv[2] = *((u_int64_t *)(&loop_buf[0])) ^ dip[0]; -+ iv[3] = *((u_int64_t *)(&loop_buf[8])) ^ dip[1]; -+ aes_encrypt(a, (unsigned char *)(&iv[2]), raw_buf); -+ dip = (u_int64_t *)raw_buf; -+ loop_buf += 16; -+ raw_buf += 16; -+ } while(--x >= 0); -+ } -+ size -= 512; -+ devSect++; -+ } -+ } -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ read_unlock(&m->rwlock); -+#endif -+ cond_resched(); -+ return(0); -+} -+ -+#if defined(CONFIG_BLK_DEV_LOOP_PADLOCK) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+static __inline__ void padlock_flush_key_context(void) -+{ -+ __asm__ __volatile__("pushf; popf" : : : "cc"); -+} -+ -+static __inline__ void 
padlock_rep_xcryptcbc(void *cw, void *k, void *s, void *d, void *iv, unsigned long cnt) -+{ -+ __asm__ __volatile__(".byte 0xF3,0x0F,0xA7,0xD0" -+ : "+a" (iv), "+c" (cnt), "+S" (s), "+D" (d) /*output*/ -+ : "b" (k), "d" (cw) /*input*/ -+ : "cc", "memory" /*modified*/ ); -+} -+ -+typedef struct { -+#if defined(HAVE_MD5_2X_IMPLEMENTATION) -+ u_int64_t iv[2*2]; -+#else -+ u_int64_t iv[2]; -+#endif -+ u_int32_t cw[4]; -+ u_int32_t dummy1[4]; -+} Padlock_IV_CW; -+ -+static int transfer_padlock_aes(struct loop_device *lo, int cmd, char *raw_buf, -+ char *loop_buf, int size, sector_t devSect) -+{ -+ aes_context *a; -+ AESmultiKey *m; -+ unsigned y; -+ Padlock_IV_CW ivcwua; -+ Padlock_IV_CW *ivcw; -+ -+ /* ivcw->iv and ivcw->cw must have 16 byte alignment */ -+ ivcw = (Padlock_IV_CW *)(((unsigned long)&ivcwua + 15) & ~((unsigned long)15)); -+ ivcw->cw[3] = ivcw->cw[2] = ivcw->cw[1] = 0; -+ -+ if(!size || (size & 511) || (((unsigned long)raw_buf | (unsigned long)loop_buf) & 15)) { -+ return -EINVAL; -+ } -+ m = (AESmultiKey *)lo->key_data; -+ y = m->keyMask; -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ read_lock(&m->rwlock); -+#endif -+ if(cmd == READ) { -+ ivcw->cw[0] = m->padlock_cw_d; -+#if defined(HAVE_MD5_2X_IMPLEMENTATION) -+ /* if possible, use faster 2x MD5 implementation, currently AMD64 only (#4) */ -+ while((size >= (2*512)) && y) { -+ /* decrypt using fake all-zero IV */ -+ memset(&ivcw->iv[0], 0, 2*16); -+ a = m->keyPtr[((unsigned)devSect ) & y]; -+ padlock_flush_key_context(); -+ padlock_rep_xcryptcbc(&ivcw->cw[0], &a->aes_d_key[0], raw_buf, loop_buf, &ivcw->iv[0], 32); -+ a = m->keyPtr[((unsigned)devSect + 1) & y]; -+ padlock_flush_key_context(); -+ padlock_rep_xcryptcbc(&ivcw->cw[0], &a->aes_d_key[0], raw_buf + 512, loop_buf + 512, &ivcw->iv[2], 32); -+ /* compute correct IV */ -+ memcpy(&ivcw->iv[0], &m->partialMD5[0], 16); -+ memcpy(&ivcw->iv[2], &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3_2x(devSect, (u_int32_t *)(&ivcw->iv[0]), (u_int32_t *)(&loop_buf[16])); -+ /* XOR with correct IV now */ -+ *((u_int64_t *)(&loop_buf[0])) ^= ivcw->iv[0]; -+ *((u_int64_t *)(&loop_buf[8])) ^= ivcw->iv[1]; -+ *((u_int64_t *)(&loop_buf[512 + 0])) ^= ivcw->iv[2]; -+ *((u_int64_t *)(&loop_buf[512 + 8])) ^= ivcw->iv[3]; -+ size -= 2*512; -+ raw_buf += 2*512; -+ loop_buf += 2*512; -+ devSect += 2; -+ } -+#endif /* defined(HAVE_MD5_2X_IMPLEMENTATION) */ -+ while(size) { -+ a = m->keyPtr[((unsigned)devSect) & y]; -+ padlock_flush_key_context(); -+ if(y) { -+ /* decrypt using fake all-zero IV */ -+ memset(&ivcw->iv[0], 0, 16); -+ padlock_rep_xcryptcbc(&ivcw->cw[0], &a->aes_d_key[0], raw_buf, loop_buf, &ivcw->iv[0], 32); -+ /* compute correct IV */ -+ memcpy(&ivcw->iv[0], &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3(devSect, (u_int32_t *)(&ivcw->iv[0]), (u_int32_t *)(&loop_buf[16])); -+ /* XOR with correct IV now */ -+ *((u_int64_t *)(&loop_buf[ 0])) ^= ivcw->iv[0]; -+ *((u_int64_t *)(&loop_buf[ 8])) ^= ivcw->iv[1]; -+ } else { -+ loop_compute_sector_iv(devSect, (u_int32_t *)(&ivcw->iv[0])); -+ padlock_rep_xcryptcbc(&ivcw->cw[0], &a->aes_d_key[0], raw_buf, loop_buf, &ivcw->iv[0], 32); -+ } -+ size -= 512; -+ raw_buf += 512; -+ loop_buf += 512; -+ devSect++; -+ } -+ } else { -+ ivcw->cw[0] = m->padlock_cw_e; -+#if defined(HAVE_MD5_2X_IMPLEMENTATION) -+ /* if possible, use faster 2x MD5 implementation, currently AMD64 only (#3) */ -+ while((size >= (2*512)) && y) { -+ memcpy(raw_buf, loop_buf, 2*512); -+ memcpy(&ivcw->iv[0], &m->partialMD5[0], 16); -+ memcpy(&ivcw->iv[2], &m->partialMD5[0], 16); -+ 
loop_compute_md5_iv_v3_2x(devSect, (u_int32_t *)(&ivcw->iv[0]), (u_int32_t *)(&raw_buf[16])); -+ a = m->keyPtr[((unsigned)devSect ) & y]; -+ padlock_flush_key_context(); -+ padlock_rep_xcryptcbc(&ivcw->cw[0], &a->aes_e_key[0], raw_buf, raw_buf, &ivcw->iv[0], 32); -+ a = m->keyPtr[((unsigned)devSect + 1) & y]; -+ padlock_flush_key_context(); -+ padlock_rep_xcryptcbc(&ivcw->cw[0], &a->aes_e_key[0], raw_buf + 512, raw_buf + 512, &ivcw->iv[2], 32); -+ size -= 2*512; -+ raw_buf += 2*512; -+ loop_buf += 2*512; -+ devSect += 2; -+ } -+#endif /* defined(HAVE_MD5_2X_IMPLEMENTATION) */ -+ while(size) { -+ a = m->keyPtr[((unsigned)devSect) & y]; -+ padlock_flush_key_context(); -+ if(y) { -+ memcpy(raw_buf, loop_buf, 512); -+ memcpy(&ivcw->iv[0], &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3(devSect, (u_int32_t *)(&ivcw->iv[0]), (u_int32_t *)(&raw_buf[16])); -+ padlock_rep_xcryptcbc(&ivcw->cw[0], &a->aes_e_key[0], raw_buf, raw_buf, &ivcw->iv[0], 32); -+ } else { -+ loop_compute_sector_iv(devSect, (u_int32_t *)(&ivcw->iv[0])); -+ padlock_rep_xcryptcbc(&ivcw->cw[0], &a->aes_e_key[0], loop_buf, raw_buf, &ivcw->iv[0], 32); -+ } -+ size -= 512; -+ raw_buf += 512; -+ loop_buf += 512; -+ devSect++; -+ } -+ } -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ read_unlock(&m->rwlock); -+#endif -+ cond_resched(); -+ return(0); -+} -+#endif -+ -+#if defined(CONFIG_BLK_DEV_LOOP_INTELAES) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+asmlinkage extern void intel_aes_cbc_encrypt(const aes_context *, void *src, void *dst, size_t len, void *iv); -+asmlinkage extern void intel_aes_cbc_decrypt(const aes_context *, void *src, void *dst, size_t len, void *iv); -+asmlinkage extern void intel_aes_cbc_enc_4x512(aes_context **, void *src, void *dst, void *iv); -+ -+static int transfer_intel_aes(struct loop_device *lo, int cmd, char *raw_buf, -+ char *loop_buf, int size, sector_t devSect) -+{ -+ aes_context *acpa[4]; -+ AESmultiKey *m; -+ unsigned y; -+ u_int64_t ivua[(4*2)+2]; -+ u_int64_t *iv; -+ -+ /* make iv 16 byte aligned */ -+ iv = (u_int64_t *)(((unsigned long)&ivua + 15) & ~((unsigned long)15)); -+ -+ if(!size || (size & 511) || (((unsigned long)raw_buf | (unsigned long)loop_buf) & 15)) { -+ return -EINVAL; -+ } -+ m = (AESmultiKey *)lo->key_data; -+ y = m->keyMask; -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ read_lock(&m->rwlock); -+#endif -+ kernel_fpu_begin(); /* intel_aes_* code uses xmm registers */ -+ if(cmd == READ) { -+#if defined(HAVE_MD5_2X_IMPLEMENTATION) -+ /* if possible, use faster 2x MD5 implementation, currently AMD64 only (#2) */ -+ while((size >= (2*512)) && y) { -+ acpa[0] = m->keyPtr[((unsigned)devSect ) & y]; -+ acpa[1] = m->keyPtr[((unsigned)devSect + 1) & y]; -+ /* decrypt using fake all-zero IV */ -+ memset(iv, 0, 2*16); -+ intel_aes_cbc_decrypt(acpa[0], raw_buf, loop_buf, 512, &iv[0]); -+ intel_aes_cbc_decrypt(acpa[1], raw_buf + 512, loop_buf + 512, 512, &iv[2]); -+ /* compute correct IV, use 2x parallelized version */ -+ memcpy(&iv[0], &m->partialMD5[0], 16); -+ memcpy(&iv[2], &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3_2x(devSect, (u_int32_t *)iv, (u_int32_t *)(&loop_buf[16])); -+ /* XOR with correct IV now */ -+ *((u_int64_t *)(&loop_buf[0])) ^= iv[0]; -+ *((u_int64_t *)(&loop_buf[8])) ^= iv[1]; -+ *((u_int64_t *)(&loop_buf[512 + 0])) ^= iv[2]; -+ *((u_int64_t *)(&loop_buf[512 + 8])) ^= iv[3]; -+ size -= 2*512; -+ raw_buf += 2*512; -+ loop_buf += 2*512; -+ devSect += 2; -+ } -+#endif /* defined(HAVE_MD5_2X_IMPLEMENTATION) */ -+ while(size) { -+ acpa[0] = m->keyPtr[((unsigned)devSect) & 
y]; -+ if(y) { -+ /* decrypt using fake all-zero IV */ -+ memset(iv, 0, 16); -+ intel_aes_cbc_decrypt(acpa[0], raw_buf, loop_buf, 512, iv); -+ /* compute correct IV */ -+ memcpy(iv, &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3(devSect, (u_int32_t *)iv, (u_int32_t *)(&loop_buf[16])); -+ /* XOR with correct IV now */ -+ *((u_int64_t *)(&loop_buf[0])) ^= iv[0]; -+ *((u_int64_t *)(&loop_buf[8])) ^= iv[1]; -+ } else { -+ loop_compute_sector_iv(devSect, (u_int32_t *)iv); -+ intel_aes_cbc_decrypt(acpa[0], raw_buf, loop_buf, 512, iv); -+ } -+ size -= 512; -+ raw_buf += 512; -+ loop_buf += 512; -+ devSect++; -+ } -+ } else { -+ /* if possible, use faster 4-chains at a time encrypt implementation (#1) */ -+ while(size >= (4*512)) { -+ acpa[0] = m->keyPtr[((unsigned)devSect ) & y]; -+ acpa[1] = m->keyPtr[((unsigned)devSect + 1) & y]; -+ acpa[2] = m->keyPtr[((unsigned)devSect + 2) & y]; -+ acpa[3] = m->keyPtr[((unsigned)devSect + 3) & y]; -+ if(y) { -+ memcpy(raw_buf, loop_buf, 4*512); -+ memcpy(&iv[0], &m->partialMD5[0], 16); -+ memcpy(&iv[2], &m->partialMD5[0], 16); -+ memcpy(&iv[4], &m->partialMD5[0], 16); -+ memcpy(&iv[6], &m->partialMD5[0], 16); -+#if defined(HAVE_MD5_2X_IMPLEMENTATION) -+ /* use 2x parallelized version */ -+ loop_compute_md5_iv_v3_2x(devSect, (u_int32_t *)(&iv[0]), (u_int32_t *)(&raw_buf[ 16])); -+ loop_compute_md5_iv_v3_2x(devSect + 2, (u_int32_t *)(&iv[4]), (u_int32_t *)(&raw_buf[0x400 + 16])); -+#else -+ loop_compute_md5_iv_v3(devSect, (u_int32_t *)(&iv[0]), (u_int32_t *)(&raw_buf[ 16])); -+ loop_compute_md5_iv_v3(devSect + 1, (u_int32_t *)(&iv[2]), (u_int32_t *)(&raw_buf[0x200 + 16])); -+ loop_compute_md5_iv_v3(devSect + 2, (u_int32_t *)(&iv[4]), (u_int32_t *)(&raw_buf[0x400 + 16])); -+ loop_compute_md5_iv_v3(devSect + 3, (u_int32_t *)(&iv[6]), (u_int32_t *)(&raw_buf[0x600 + 16])); -+#endif -+ intel_aes_cbc_enc_4x512(&acpa[0], raw_buf, raw_buf, iv); -+ } else { -+ loop_compute_sector_iv(devSect, (u_int32_t *)(&iv[0])); -+ loop_compute_sector_iv(devSect + 1, (u_int32_t *)(&iv[2])); -+ loop_compute_sector_iv(devSect + 2, (u_int32_t *)(&iv[4])); -+ loop_compute_sector_iv(devSect + 3, (u_int32_t *)(&iv[6])); -+ intel_aes_cbc_enc_4x512(&acpa[0], loop_buf, raw_buf, iv); -+ } -+ size -= 4*512; -+ raw_buf += 4*512; -+ loop_buf += 4*512; -+ devSect += 4; -+ } -+ /* encrypt the rest (if any) using slower 1-chain at a time implementation */ -+ while(size) { -+ acpa[0] = m->keyPtr[((unsigned)devSect) & y]; -+ if(y) { -+ memcpy(raw_buf, loop_buf, 512); -+ memcpy(iv, &m->partialMD5[0], 16); -+ loop_compute_md5_iv_v3(devSect, (u_int32_t *)iv, (u_int32_t *)(&raw_buf[16])); -+ intel_aes_cbc_encrypt(acpa[0], raw_buf, raw_buf, 512, iv); -+ } else { -+ loop_compute_sector_iv(devSect, (u_int32_t *)iv); -+ intel_aes_cbc_encrypt(acpa[0], loop_buf, raw_buf, 512, iv); -+ } -+ size -= 512; -+ raw_buf += 512; -+ loop_buf += 512; -+ devSect++; -+ } -+ } -+ kernel_fpu_end(); /* intel_aes_* code uses xmm registers */ -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ read_unlock(&m->rwlock); -+#endif -+ cond_resched(); -+ return(0); -+} -+#endif -+ -+static struct loop_func_table funcs_aes = { -+ number: 16, /* 16 == AES */ -+ transfer: transfer_aes, -+ init: keySetup_aes, -+ release: keyClean_aes, -+ ioctl: handleIoctl_aes -+}; -+ -+#if defined(CONFIG_BLK_DEV_LOOP_PADLOCK) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+static struct loop_func_table funcs_padlock_aes = { -+ number: 16, /* 16 == AES */ -+ transfer: transfer_padlock_aes, -+ init: keySetup_aes, -+ release: keyClean_aes, -+ ioctl: 
handleIoctl_aes -+}; -+#endif -+ -+#if defined(CONFIG_BLK_DEV_LOOP_INTELAES) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+static struct loop_func_table funcs_intel_aes = { -+ number: 16, /* 16 == AES */ -+ transfer: transfer_intel_aes, -+ init: keySetup_aes, -+ release: keyClean_aes, -+ ioctl: handleIoctl_aes -+}; -+#endif -+ -+#if defined(CONFIG_BLK_DEV_LOOP_PADLOCK) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+static int CentaurHauls_ID_and_enabled_ACE(void) -+{ -+ unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0; -+ -+ /* check for "CentaurHauls" ID string, and enabled ACE */ -+ cpuid(0x00000000, &eax, &ebx, &ecx, &edx); -+ if((ebx == 0x746e6543) && (edx == 0x48727561) && (ecx == 0x736c7561) -+ && (cpuid_eax(0xC0000000) >= 0xC0000001) -+ && ((cpuid_edx(0xC0000001) & 0xC0) == 0xC0)) { -+ return 1; /* ACE enabled */ -+ } -+ return 0; -+} -+#endif -+ -+EXPORT_SYMBOL(loop_compute_sector_iv); -+EXPORT_SYMBOL(loop_compute_md5_iv_v3); -+EXPORT_SYMBOL(loop_compute_md5_iv); -+EXPORT_SYMBOL(md5_transform_CPUbyteorder_C); -+#endif /* CONFIG_BLK_DEV_LOOP_AES */ -+ -+/* xfer_funcs[0] is special - its release function is never called */ -+static struct loop_func_table *xfer_funcs[MAX_LO_CRYPT] = { -+ &none_funcs, -+ &xor_funcs, -+#ifdef CONFIG_BLK_DEV_LOOP_AES -+ [LO_CRYPT_AES] = &funcs_aes, -+#endif -+}; -+ -+/* -+ * First number of 'lo_prealloc' is the default number of RAM pages -+ * to pre-allocate for each device backed loop. Every (configured) -+ * device backed loop pre-allocates this amount of RAM pages unless -+ * later 'lo_prealloc' numbers provide an override. 'lo_prealloc' -+ * overrides are defined in pairs: loop_index,number_of_pages -+ */ -+static int lo_prealloc[9] = { 256, -1, 0, -1, 0, -1, 0, -1, 0 }; -+#define LO_PREALLOC_MIN 4 /* minimum user defined pre-allocated RAM pages */ -+#define LO_PREALLOC_MAX 4096 /* maximum user defined pre-allocated RAM pages */ -+ -+#ifdef MODULE -+static int dummy1; -+module_param_array(lo_prealloc, int, &dummy1, 0); -+MODULE_PARM_DESC(lo_prealloc, "Number of pre-allocated pages [,index,pages]..."); -+#else -+static int __init lo_prealloc_setup(char *str) -+{ -+ int x, y, z; -+ -+ for (x = 0; x < (sizeof(lo_prealloc) / sizeof(int)); x++) { -+ z = get_option(&str, &y); -+ if (z > 0) -+ lo_prealloc[x] = y; -+ if (z < 2) -+ break; -+ } -+ return 1; -+} -+__setup("lo_prealloc=", lo_prealloc_setup); -+#endif -+ -+/* -+ * First number of 'lo_threads' is the default number of helper threads to -+ * create for each device backed loop device. Every (configured) device -+ * backed loop device has this many threads unless later 'lo_threads' -+ * numbers provide an override. File backed loops always have 1 helper -+ * thread. 'lo_threads' overrides are defined in pairs: loop_index,threads -+ * -+ * This value is ignored on 2.6.18 and older kernels. 
-+ */ -+static int lo_threads[9] = { 1, -1, 0, -1, 0, -1, 0, -1, 0 }; -+#define LO_THREADS_MIN 1 /* minimum user defined thread count */ -+#define LO_THREADS_MAX 4 /* maximum user defined thread count */ -+ -+#ifdef MODULE -+static int dummy2; -+module_param_array(lo_threads, int, &dummy2, 0); -+MODULE_PARM_DESC(lo_threads, "Number of threads per loop [,index,threads]..."); -+#else -+static int __init lo_threads_setup(char *str) -+{ -+ int x, y, z; -+ -+ for (x = 0; x < (sizeof(lo_threads) / sizeof(int)); x++) { -+ z = get_option(&str, &y); -+ if (z > 0) -+ lo_threads[x] = y; -+ if (z < 2) -+ break; -+ } -+ return 1; -+} -+__setup("lo_threads=", lo_threads_setup); -+#endif -+ -+/* -+ * This is loop helper thread nice value in range -+ * from 0 (low priority) to -20 (high priority). -+ */ -+static int lo_nice = -1; -+ -+#ifdef MODULE -+module_param(lo_nice, int, 0); -+MODULE_PARM_DESC(lo_nice, "Loop thread scheduler nice (0 ... -20)"); -+#else -+static int __init lo_nice_setup(char *str) -+{ -+ int y; -+ -+ if (get_option(&str, &y) == 1) -+ lo_nice = y; -+ return 1; -+} -+__setup("lo_nice=", lo_nice_setup); -+#endif -+ -+struct loop_bio_extension { -+ struct bio *bioext_merge; -+ struct loop_device *bioext_loop; -+ struct bio_vec *bioext_bi_io_vec_orig; -+ sector_t bioext_iv; -+ int bioext_index; -+ int bioext_size; -+ unsigned int bioext_bi_max_vecs_orig; -+}; -+ -+static struct loop_device **loop_dev_ptr_arr; -+ -+static void loop_prealloc_cleanup(struct loop_device *lo) -+{ -+ struct bio *bio; -+ struct loop_bio_extension *extension; -+ -+ while ((bio = lo->lo_bio_free0)) { -+ lo->lo_bio_free0 = bio->bi_next; -+ extension = bio->bi_private; -+ bio->bi_io_vec = extension->bioext_bi_io_vec_orig; -+ bio->bi_max_vecs = extension->bioext_bi_max_vecs_orig; -+ bio->bi_vcnt = 1; -+ __free_page(bio->bi_io_vec[0].bv_page); -+ kfree(extension); -+ bio->bi_next = NULL; -+ bio_put(bio); -+ } -+ while ((bio = lo->lo_bio_free1)) { -+ lo->lo_bio_free1 = bio->bi_next; -+ /* bi_flags bit 0 was used for other purpose */ -+ clear_bit(0, &bio->bi_flags); -+ /* bi_size was used for other purpose */ -+ bio->bi_size = 0; -+ /* bi_cnt was used for other purpose */ -+ atomic_set(&bio->bi_cnt, 1); -+ bio->bi_next = NULL; -+ bio_put(bio); -+ } -+} -+ -+static int loop_prealloc_init(struct loop_device *lo, int y) -+{ -+ struct bio *bio; -+ struct loop_bio_extension *extension; -+ int x; -+ -+ if(!y) { -+ y = lo_prealloc[0]; -+ for (x = 1; x < (sizeof(lo_prealloc) / sizeof(int)); x += 2) { -+ if (lo_prealloc[x + 1] && (lo->lo_number == lo_prealloc[x])) { -+ y = lo_prealloc[x + 1]; -+ break; -+ } -+ } -+ } -+ lo->lo_bio_flshMax = (y * 3) / 4; -+ lo->lo_bio_flshCnt = 0; -+ -+ for (x = 0; x < y; x++) { -+ bio = bio_alloc(GFP_KERNEL, 1); -+ if (!bio) { -+ fail1: -+ loop_prealloc_cleanup(lo); -+ return 1; -+ } -+ bio->bi_io_vec[0].bv_page = alloc_page(GFP_KERNEL); -+ if (!bio->bi_io_vec[0].bv_page) { -+ fail2: -+ bio->bi_next = NULL; -+ bio_put(bio); -+ goto fail1; -+ } -+ memset(page_address(bio->bi_io_vec[0].bv_page), 0, PAGE_SIZE); -+ bio->bi_vcnt = 1; -+ extension = kmalloc(sizeof(struct loop_bio_extension), GFP_KERNEL); -+ if (!extension) { -+ __free_page(bio->bi_io_vec[0].bv_page); -+ goto fail2; -+ } -+ bio->bi_private = extension; -+ extension->bioext_bi_io_vec_orig = bio->bi_io_vec; -+ extension->bioext_bi_max_vecs_orig = bio->bi_max_vecs; -+ bio->bi_next = lo->lo_bio_free0; -+ lo->lo_bio_free0 = bio; -+ -+ bio = bio_alloc(GFP_KERNEL, 1); -+ if (!bio) -+ goto fail1; -+ bio->bi_vcnt = 1; -+ bio->bi_next = 
lo->lo_bio_free1; -+ lo->lo_bio_free1 = bio; -+ } -+ return 0; -+} -+ -+static void loop_add_queue_last(struct loop_device *lo, struct bio *bio, struct bio **q) -+{ -+ unsigned long flags; -+ -+ spin_lock_irqsave(&lo->lo_lock, flags); -+ if (*q) { -+ bio->bi_next = (*q)->bi_next; -+ (*q)->bi_next = bio; -+ } else { -+ bio->bi_next = bio; -+ } -+ *q = bio; -+ spin_unlock_irqrestore(&lo->lo_lock, flags); -+ -+ if (waitqueue_active(&lo->lo_bio_wait)) -+ wake_up_interruptible_all(&lo->lo_bio_wait); -+} -+ -+static struct bio *loop_get_bio(struct loop_device *lo) -+{ -+ struct bio *bio = NULL, *last; -+ -+ spin_lock_irq(&lo->lo_lock); -+ if ((last = lo->lo_bio_que0)) { -+ bio = last->bi_next; -+ if (bio == last) -+ lo->lo_bio_que0 = NULL; -+ else -+ last->bi_next = bio->bi_next; -+ bio->bi_next = NULL; -+ } -+ spin_unlock_irq(&lo->lo_lock); -+ return bio; -+} -+ -+static void loop_put_buffer(struct loop_device *lo, struct bio *b, int flist) -+{ -+ unsigned long flags; -+ -+ spin_lock_irqsave(&lo->lo_lock, flags); -+ if(!flist) { -+ b->bi_next = lo->lo_bio_free0; -+ lo->lo_bio_free0 = b; -+ } else { -+ b->bi_next = lo->lo_bio_free1; -+ lo->lo_bio_free1 = b; -+ } -+ spin_unlock_irqrestore(&lo->lo_lock, flags); -+ -+ if (waitqueue_active(&lo->lo_buf_wait)) -+ wake_up_all(&lo->lo_buf_wait); -+} -+ -+static void loop_end_io_transfer(struct bio *bio, int err) -+{ -+ struct loop_bio_extension *extension = bio->bi_private; -+ struct bio *merge = extension->bioext_merge; -+ struct loop_device *lo = extension->bioext_loop; -+ struct bio *origbio = merge->bi_private; -+ -+ if (err) { -+ merge->bi_size = err; /* used as error code */ -+ if(err == -EIO) -+ clear_bit(0, &merge->bi_flags); -+ printk(KERN_ERR "loop%d: loop_end_io_transfer err=%d bi_rw=0x%lx\n", lo->lo_number, err, bio->bi_rw); -+ } -+ if (bio_rw(bio) == WRITE) { -+ loop_put_buffer(lo, bio, 0); -+ if (!atomic_dec_and_test(&merge->bi_cnt)) { -+ return; -+ } -+ origbio->bi_next = NULL; -+ bio_endio(origbio, test_bit(0, &merge->bi_flags) ? 
(int)merge->bi_size : -EIO); -+ loop_put_buffer(lo, merge, 1); -+ if (atomic_dec_and_test(&lo->lo_pending)) -+ wake_up_interruptible_all(&lo->lo_bio_wait); -+ } else { -+ loop_add_queue_last(lo, bio, &lo->lo_bio_que0); -+ } -+} -+ -+static struct bio *loop_get_buffer(struct loop_device *lo, struct bio *orig_bio, -+ struct bio **merge_ptr, int *flushPtr) -+{ -+ struct bio *bio = NULL, *merge = *merge_ptr, *fbtst; -+ struct loop_bio_extension *extension; -+ int len, nzCnt, flsh = 0, firstVec, lastVec; -+ -+ spin_lock_irq(&lo->lo_lock); -+ if (!merge) { -+ merge = lo->lo_bio_free1; -+ if (merge) { -+ lo->lo_bio_free1 = merge->bi_next; -+ } -+ } -+ if (merge) { -+ bio = lo->lo_bio_free0; -+ if (bio) { -+ lo->lo_bio_free0 = bio->bi_next; -+ } -+ } -+ fbtst = lo->lo_bio_free0; -+ if(!fbtst || !fbtst->bi_next) { -+ flsh = 1; -+ } -+ fbtst = lo->lo_bio_free1; -+ if(!fbtst || !fbtst->bi_next) { -+ flsh = 1; -+ } -+ spin_unlock_irq(&lo->lo_lock); -+ -+ *flushPtr = flsh; -+ -+ if (!(*merge_ptr) && merge) { -+ /* -+ * initialize "merge-bio" which is used as -+ * rendezvous point among multiple vecs -+ */ -+ *merge_ptr = merge; -+ merge->bi_sector = orig_bio->bi_sector + lo->lo_offs_sec; -+ merge->bi_size = 0; /* used as error code */ -+ set_bit(0, &merge->bi_flags); -+ merge->bi_idx = orig_bio->bi_idx; -+ nzCnt = orig_bio->bi_vcnt - orig_bio->bi_idx; -+ if(nzCnt < 1) nzCnt = 1; -+ atomic_set(&merge->bi_cnt, nzCnt); -+ merge->bi_private = orig_bio; -+ } -+ -+ if (!bio) -+ return NULL; -+ -+ extension = bio->bi_private; -+ firstVec = (!orig_bio->bi_vcnt || (merge->bi_idx == orig_bio->bi_idx)) ? 1 : 0; -+ lastVec = (!orig_bio->bi_vcnt || (merge->bi_idx == (orig_bio->bi_vcnt - 1))) ? 1 : 0; -+ -+ /* -+ * initialize one page "buffer-bio" -+ */ -+#if LINUX_VERSION_CODE >= 0x30700 -+ bio_reset(bio); -+ bio->bi_private = extension; -+#else -+#if !defined(BIO_RESET_BITS) -+# define BIO_RESET_BITS BIO_POOL_OFFSET -+#endif -+ bio->bi_flags &= (~0UL << BIO_RESET_BITS); -+ bio->bi_flags |= (1 << BIO_UPTODATE); -+#endif -+ bio->bi_sector = merge->bi_sector; -+ bio->bi_next = NULL; -+ bio->bi_bdev = lo->lo_device; -+ -+#if LINUX_VERSION_CODE < 0x30200 -+ if(orig_bio->bi_flags & (1 << BIO_CPU_AFFINE)) { -+ bio->bi_comp_cpu = orig_bio->bi_comp_cpu; -+ bio->bi_flags |= (1 << BIO_CPU_AFFINE); -+ } -+#endif -+ /* read-ahead bit needs to be cleared to work around kernel bug */ -+ /* that causes I/O errors on -EWOULDBLOCK I/O elevator failures */ -+ bio->bi_rw = orig_bio->bi_rw & ~L_BIO_RW_AHEAD; -+ -+ if(orig_bio->bi_rw & REQ_FLUSH) { -+ if(!firstVec) { -+ bio->bi_rw &= ~REQ_FLUSH; -+ } else { -+ *flushPtr = 1; -+ } -+ } -+ if(orig_bio->bi_rw & REQ_FUA) { -+ if(!lastVec) { -+ bio->bi_rw &= ~REQ_FUA; -+ } else { -+ *flushPtr = 1; -+ } -+ } -+ if(orig_bio->bi_rw & L_BIO_RW_SYNCIO) { -+ if(!lastVec) { -+ bio->bi_rw &= ~L_BIO_RW_SYNCIO; -+ } else { -+ *flushPtr = 1; -+ } -+ } -+ if(orig_bio->bi_rw & L_BIO_RW_NOIDLE) { -+ if(!lastVec) { -+ bio->bi_rw &= ~L_BIO_RW_NOIDLE; -+ } -+ } -+ if(flsh) { -+ bio->bi_rw |= L_BIO_RW_NOIDLE; -+ } -+ -+ bio->bi_idx = 0; -+ bio->bi_phys_segments = 0; -+ if(orig_bio->bi_io_vec && orig_bio->bi_vcnt) { -+ /* original bio has data */ -+ bio->bi_io_vec = extension->bioext_bi_io_vec_orig; -+ bio->bi_max_vecs = extension->bioext_bi_max_vecs_orig; -+ bio->bi_vcnt = 1; -+ bio->bi_size = len = orig_bio->bi_io_vec[merge->bi_idx].bv_len; -+ bio->bi_io_vec[0].bv_len = len; -+ bio->bi_io_vec[0].bv_offset = 0; -+ } else { -+ /* original bio does not have data */ -+ bio->bi_io_vec = 0; /* bio_has_data() 
expects this to be zero */ -+ bio->bi_max_vecs = 0; /* __bio_clone() expects this to be zero */ -+ bio->bi_vcnt = 0; -+ bio->bi_size = len = 0; -+ } -+ -+ bio->bi_seg_front_size = 0; -+ bio->bi_seg_back_size = 0; -+ bio->bi_end_io = loop_end_io_transfer; -+ -+ /* -+ * initialize "buffer-bio" extension. This extension is -+ * permanently glued to above "buffer-bio" via bio->bi_private -+ */ -+ extension->bioext_merge = merge; -+ extension->bioext_loop = lo; -+ extension->bioext_iv = merge->bi_sector - lo->lo_iv_remove; -+ extension->bioext_index = merge->bi_idx; -+ extension->bioext_size = len; -+ -+ /* -+ * prepare "merge-bio" for next vec -+ */ -+ merge->bi_sector += len >> 9; -+ merge->bi_idx++; -+ -+ return bio; -+} -+ -+static int figure_loop_size(struct loop_device *lo, struct block_device *bdev) -+{ -+ loff_t size, offs; -+ sector_t x; -+ int err = 0; -+ -+ size = i_size_read(lo->lo_backing_file->f_path.dentry->d_inode->i_mapping->host); -+ offs = lo->lo_offset; -+ if (!(lo->lo_flags & LO_FLAGS_DO_BMAP)) -+ offs &= ~((loff_t)511); -+ if ((offs > 0) && (offs < size)) { -+ size -= offs; -+ } else { -+ if (offs) -+ err = -EINVAL; -+ lo->lo_offset = 0; -+ lo->lo_offs_sec = lo->lo_iv_remove = 0; -+ } -+ if ((lo->lo_sizelimit > 0) && (lo->lo_sizelimit <= size)) { -+ size = lo->lo_sizelimit; -+ } else { -+ if (lo->lo_sizelimit) -+ err = -EINVAL; -+ lo->lo_sizelimit = 0; -+ } -+ size >>= 9; -+ -+ /* -+ * Unfortunately, if we want to do I/O on the device, -+ * the number of 512-byte sectors has to fit into a sector_t. -+ */ -+ x = (sector_t)size; -+ if ((loff_t)x != size) { -+ err = -EFBIG; -+ size = 0; -+ } -+ -+ set_capacity(disks[lo->lo_number], size); /* 512 byte units */ -+ i_size_write(bdev->bd_inode, size << 9); /* byte units */ -+ return err; -+} -+ -+static inline int lo_do_transfer(struct loop_device *lo, int cmd, char *rbuf, -+ char *lbuf, int size, sector_t rblock) -+{ -+ if (!lo->transfer) -+ return 0; -+ -+ return lo->transfer(lo, cmd, rbuf, lbuf, size, rblock); -+} -+ -+static int loop_file_io(struct file *file, char *buf, int size, loff_t *ppos, int w) -+{ -+ mm_segment_t fs; -+ int x, y, z; -+ -+ y = 0; -+ do { -+ z = size - y; -+ fs = get_fs(); -+ set_fs(get_ds()); -+ if (w) { -+ x = file->f_op->write(file, buf + y, z, ppos); -+ set_fs(fs); -+ } else { -+ x = file->f_op->read(file, buf + y, z, ppos); -+ set_fs(fs); -+ if (!x) -+ return 1; -+ } -+ if (x < 0) { -+ if ((x == -EAGAIN) || (x == -ENOMEM) || (x == -ERESTART) || (x == -EINTR)) { -+ set_current_state(TASK_INTERRUPTIBLE); -+ schedule_timeout(HZ / 2); -+ continue; -+ } -+ return 1; -+ } -+ y += x; -+ } while (y < size); -+ return 0; -+} -+ -+static int do_bio_filebacked(struct loop_device *lo, struct bio *bio) -+{ -+ loff_t pos; -+ struct file *file = lo->lo_backing_file; -+ char *data, *buf; -+ unsigned int size, len; -+ sector_t IV; -+ struct page *pg; -+ -+ if(!bio->bi_io_vec || !bio->bi_vcnt) -+ return 0; -+ -+ pos = ((loff_t) bio->bi_sector << 9) + lo->lo_offset; -+ buf = page_address(lo->lo_bio_free0->bi_io_vec[0].bv_page); -+ IV = bio->bi_sector; -+ if (!lo->lo_iv_remove) -+ IV += lo->lo_offs_sec; -+ do { -+ pg = bio->bi_io_vec[bio->bi_idx].bv_page; -+ len = bio->bi_io_vec[bio->bi_idx].bv_len; -+ data = kmap(pg) + bio->bi_io_vec[bio->bi_idx].bv_offset; -+ while (len > 0) { -+ if (!lo->lo_encryption) { -+ /* this code relies that NONE transfer is a no-op */ -+ buf = data; -+ } -+ size = PAGE_CACHE_SIZE; -+ if (size > len) -+ size = len; -+ if (bio_rw(bio) == WRITE) { -+ if (lo_do_transfer(lo, WRITE, buf, data, 
size, IV)) { -+ printk(KERN_ERR "loop%d: write transfer error, sector %llu\n", lo->lo_number, (unsigned long long)IV); -+ goto kunmap_and_out; -+ } -+ if (loop_file_io(file, buf, size, &pos, 1)) { -+ printk(KERN_ERR "loop%d: write i/o error, sector %llu\n", lo->lo_number, (unsigned long long)IV); -+ goto kunmap_and_out; -+ } -+ } else { -+ if (loop_file_io(file, buf, size, &pos, 0)) { -+ printk(KERN_ERR "loop%d: read i/o error, sector %llu\n", lo->lo_number, (unsigned long long)IV); -+ goto kunmap_and_out; -+ } -+ if (lo_do_transfer(lo, READ, buf, data, size, IV)) { -+ printk(KERN_ERR "loop%d: read transfer error, sector %llu\n", lo->lo_number, (unsigned long long)IV); -+ goto kunmap_and_out; -+ } -+ flush_dcache_page(pg); -+ } -+ data += size; -+ len -= size; -+ IV += size >> 9; -+ } -+ kunmap(pg); -+ } while (++bio->bi_idx < bio->bi_vcnt); -+ return 0; -+ -+kunmap_and_out: -+ kunmap(pg); -+ return -EIO; -+} -+ -+static void loop_unplug_backingdev(struct request_queue *bq) -+{ -+ struct blk_plug *plug = current->plug; -+ if(plug) { -+ /* A thread may sleep and wait for new buffers from previously submitted requests. */ -+ /* Make sure requests are actually sent to backing device, and not just queued. */ -+ struct bio_list *blistTmp = current->bio_list; -+ current->bio_list = NULL; -+ blk_finish_plug(plug); /* clears current->plug */ -+ current->bio_list = blistTmp; -+ blk_start_plug(plug); /* sets current->plug */ -+ } -+} -+ -+#if LINUX_VERSION_CODE >= 0x30200 -+static void loop_make_request_err(struct request_queue *q, struct bio *old_bio) -+#else -+static int loop_make_request_err(struct request_queue *q, struct bio *old_bio) -+#endif -+{ -+ old_bio->bi_next = NULL; -+ bio_io_error(old_bio); -+#if LINUX_VERSION_CODE >= 0x30200 -+ return; -+#else -+ return 0; -+#endif -+} -+ -+#if LINUX_VERSION_CODE >= 0x30200 -+static void loop_make_request_real(struct request_queue *q, struct bio *old_bio) -+#else -+static int loop_make_request_real(struct request_queue *q, struct bio *old_bio) -+#endif -+{ -+ struct bio *new_bio, *merge; -+ struct loop_device *lo = q->queuedata; -+ struct loop_bio_extension *extension; -+ int rw = bio_rw(old_bio), y, x, flushFlag = 0; -+ char *md; -+ wait_queue_t waitq; -+ -+ set_current_state(TASK_RUNNING); -+ if (!lo) -+ goto out; -+ if ((rw == WRITE) && (lo->lo_flags & LO_FLAGS_READ_ONLY)) -+ goto out; -+ atomic_inc(&lo->lo_pending); -+ -+ /* -+ * file backed, queue for loop_thread to handle -+ */ -+ if (lo->lo_flags & LO_FLAGS_DO_BMAP) { -+ loop_add_queue_last(lo, old_bio, &lo->lo_bio_que0); -+#if LINUX_VERSION_CODE >= 0x30200 -+ return; -+#else -+ return 0; -+#endif -+ } -+ -+ /* -+ * device backed, just remap bdev & sector for NONE transfer -+ */ -+ if (!lo->lo_encryption) { -+ old_bio->bi_sector += lo->lo_offs_sec; -+ old_bio->bi_bdev = lo->lo_device; -+ generic_make_request(old_bio); -+ if (atomic_dec_and_test(&lo->lo_pending)) -+ wake_up_interruptible_all(&lo->lo_bio_wait); -+#if LINUX_VERSION_CODE >= 0x30200 -+ return; -+#else -+ return 0; -+#endif -+ } -+ -+ /* -+ * device backed, start reads and writes now if buffer available -+ */ -+ merge = NULL; -+ init_waitqueue_entry(&waitq, current); -+ try_next_old_bio_vec: -+ new_bio = loop_get_buffer(lo, old_bio, &merge, &flushFlag); -+ if (!new_bio) { -+ /* wait for buffer to be freed, and try again */ -+ spin_lock_irq(&lo->lo_lock); -+ lo->lo_bio_flshCnt = 0; -+ spin_unlock_irq(&lo->lo_lock); -+ loop_unplug_backingdev(lo->lo_backingQueue); -+ add_wait_queue(&lo->lo_buf_wait, &waitq); -+ for (;;) { -+ 
set_current_state(TASK_UNINTERRUPTIBLE); -+ x = 0; -+ spin_lock_irq(&lo->lo_lock); -+ if (!merge && lo->lo_bio_free1) { -+ /* don't sleep if merge bio is available */ -+ x = 1; -+ } -+ if (merge && lo->lo_bio_free0) { -+ /* don't sleep if buffer bio is available */ -+ x = 1; -+ } -+ spin_unlock_irq(&lo->lo_lock); -+ if (x) -+ break; -+ schedule(); -+ } -+ set_current_state(TASK_RUNNING); -+ remove_wait_queue(&lo->lo_buf_wait, &waitq); -+ goto try_next_old_bio_vec; -+ } -+ if ((rw == WRITE) && old_bio->bi_io_vec && old_bio->bi_vcnt) { -+ extension = new_bio->bi_private; -+ y = extension->bioext_index; -+ md = kmap(old_bio->bi_io_vec[y].bv_page) + old_bio->bi_io_vec[y].bv_offset; -+ if (lo_do_transfer(lo, WRITE, page_address(new_bio->bi_io_vec[0].bv_page), md, extension->bioext_size, extension->bioext_iv)) { -+ clear_bit(0, &merge->bi_flags); -+ } -+ kunmap(old_bio->bi_io_vec[y].bv_page); -+ } -+ -+ /* merge & old_bio may vanish during generic_make_request() */ -+ /* if last vec gets processed before function returns */ -+ y = (merge->bi_idx < old_bio->bi_vcnt) ? 1 : 0; -+ -+ x = 0; -+ spin_lock_irq(&lo->lo_lock); -+ if((++lo->lo_bio_flshCnt >= lo->lo_bio_flshMax) || flushFlag) { -+ x = 1; -+ lo->lo_bio_flshCnt = 0; -+ new_bio->bi_rw |= L_BIO_RW_NOIDLE; -+ } -+ spin_unlock_irq(&lo->lo_lock); -+ -+ /* A thread may sleep and wait for new buffers from previously submitted requests. */ -+ /* Make sure requests are actually sent to backing device, and not just queued. */ -+ { -+ struct bio_list *blistTmp = current->bio_list; -+ current->bio_list = NULL; -+ generic_make_request(new_bio); -+ current->bio_list = blistTmp; -+ } -+ -+ if (x) -+ loop_unplug_backingdev(lo->lo_backingQueue); -+ -+ /* other vecs may need processing too */ -+ if (y) -+ goto try_next_old_bio_vec; -+#if LINUX_VERSION_CODE >= 0x30200 -+ return; -+#else -+ return 0; -+#endif -+ -+out: -+ old_bio->bi_next = NULL; -+ bio_io_error(old_bio); -+#if LINUX_VERSION_CODE >= 0x30200 -+ return; -+#else -+ return 0; -+#endif -+} -+ -+struct loop_switch_request { -+ struct file *file; -+ struct completion wait; -+}; -+ -+static void do_loop_switch(struct loop_device *lo, struct loop_switch_request *p) -+{ -+ struct file *file = p->file; -+ struct file *old_file=lo->lo_backing_file; -+ struct address_space *mapping = file->f_path.dentry->d_inode->i_mapping; -+ -+ /* This code runs on file backed loop only */ -+ /* no need to worry about -1 old_gfp_mask */ -+ mapping_set_gfp_mask(old_file->f_path.dentry->d_inode->i_mapping, lo->old_gfp_mask); -+ lo->lo_backing_file = file; -+ memset(lo->lo_file_name, 0, LO_NAME_SIZE); -+ lo->old_gfp_mask = mapping_gfp_mask(mapping); -+ mapping_set_gfp_mask(mapping, (lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS)) | __GFP_HIGH); -+ complete(&p->wait); -+} -+ -+/* -+ * worker thread that handles reads/writes to file backed loop devices, -+ * to avoid blocking in our make_request_fn. it also does loop decrypting -+ * on reads for block backed loop, as that is too heavy to do from -+ * b_end_io context where irqs may be disabled. 
-+ */ -+static int loop_thread(void *data) -+{ -+ struct loop_device *lo = data; -+ struct bio *bio, *xbio, *merge; -+ struct loop_bio_extension *extension; -+ int x = 0, y; -+ wait_queue_t waitq; -+ char *md; -+ static const struct rlimit loop_rlim_defaults[RLIM_NLIMITS] = INIT_RLIMITS; -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ void (*keyscrubFn)(void *) = 0; -+#endif -+ -+ init_waitqueue_entry(&waitq, current); -+ memcpy(¤t->signal->rlim[0], &loop_rlim_defaults[0], sizeof(current->signal->rlim)); -+ -+ /* -+ * loop can be used in an encrypted device, -+ * hence, it mustn't be stopped at all -+ * because it could be indirectly used during suspension -+ */ -+ current->flags |= PF_NOFREEZE; -+ current->flags |= PF_LESS_THROTTLE; -+ -+ if (lo_nice > 0) -+ lo_nice = 0; -+ if (lo_nice < -20) -+ lo_nice = -20; -+ set_user_nice(current, lo_nice); -+ -+ atomic_inc(&lo->lo_pending); -+ -+ /* -+ * up sem, we are running -+ */ -+ complete(&lo->lo_done); -+ -+ for (;;) { -+ add_wait_queue(&lo->lo_bio_wait, &waitq); -+ for (;;) { -+ set_current_state(TASK_INTERRUPTIBLE); -+ if (!atomic_read(&lo->lo_pending)) -+ break; -+ -+ x = 0; -+ spin_lock_irq(&lo->lo_lock); -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ if((keyscrubFn = lo->lo_keyscrub_fn) != 0) { -+ lo->lo_keyscrub_fn = 0; -+ x = 1; -+ } -+#endif -+ if (lo->lo_bio_que0) { -+ /* don't sleep if device backed READ needs processing */ -+ /* don't sleep if file backed READ/WRITE needs processing */ -+ x = 1; -+ } -+ spin_unlock_irq(&lo->lo_lock); -+ if (x) -+ break; -+ -+ schedule(); -+ } -+ set_current_state(TASK_RUNNING); -+ remove_wait_queue(&lo->lo_bio_wait, &waitq); -+ -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ if(keyscrubFn) { -+ (*keyscrubFn)(lo->lo_keyscrub_ptr); -+ keyscrubFn = 0; -+ } -+#endif -+ /* -+ * could be woken because of tear-down, not because of -+ * pending work -+ */ -+ if (!atomic_read(&lo->lo_pending)) -+ break; -+ -+ bio = loop_get_bio(lo); -+ if (!bio) -+ continue; -+ -+ if (lo->lo_flags & LO_FLAGS_DO_BMAP) { -+ /* request is for file backed device */ -+ if(unlikely(!bio->bi_bdev)) { -+ do_loop_switch(lo, bio->bi_private); -+ bio->bi_next = NULL; -+ bio_put(bio); -+ } else { -+ y = do_bio_filebacked(lo, bio); -+ bio->bi_next = NULL; -+ bio_endio(bio, y); -+ } -+ } else { -+ /* device backed read has completed, do decrypt now */ -+ extension = bio->bi_private; -+ merge = extension->bioext_merge; -+ y = extension->bioext_index; -+ xbio = merge->bi_private; -+ if(xbio->bi_io_vec && xbio->bi_vcnt) { -+ md = kmap(xbio->bi_io_vec[y].bv_page) + xbio->bi_io_vec[y].bv_offset; -+ if (lo_do_transfer(lo, READ, page_address(bio->bi_io_vec[0].bv_page), md, extension->bioext_size, extension->bioext_iv)) { -+ clear_bit(0, &merge->bi_flags); -+ } -+ flush_dcache_page(xbio->bi_io_vec[y].bv_page); -+ kunmap(xbio->bi_io_vec[y].bv_page); -+ } -+ loop_put_buffer(lo, bio, 0); -+ if (!atomic_dec_and_test(&merge->bi_cnt)) -+ continue; -+ xbio->bi_next = NULL; -+ bio_endio(xbio, test_bit(0, &merge->bi_flags) ? 
(int)merge->bi_size : -EIO); -+ loop_put_buffer(lo, merge, 1); -+ } -+ -+ /* -+ * woken both for pending work and tear-down, lo_pending -+ * will hit zero then -+ */ -+ if (atomic_dec_and_test(&lo->lo_pending)) -+ break; -+ } -+ -+ complete(&lo->lo_done); -+ return 0; -+} -+ -+static void loop_set_softblksz(struct loop_device *lo, struct block_device *bdev) -+{ -+ int bs, x; -+ -+ if (lo->lo_device) -+ bs = block_size(lo->lo_device); -+ else -+ bs = PAGE_SIZE; -+ if (lo->lo_flags & LO_FLAGS_DO_BMAP) { -+ x = (int) bdev->bd_inode->i_size; -+ if ((bs == 8192) && (x & 0x1E00)) -+ bs = 4096; -+ if ((bs == 4096) && (x & 0x0E00)) -+ bs = 2048; -+ if ((bs == 2048) && (x & 0x0600)) -+ bs = 1024; -+ if ((bs == 1024) && (x & 0x0200)) -+ bs = 512; -+ } -+ set_blocksize(bdev, bs); -+} -+ -+/* -+ * loop_change_fd switches the backing store of a loopback device to a -+ * new file. This is useful for operating system installers to free up the -+ * original file and in High Availability environments to switch to an -+ * alternative location for the content in case of server meltdown. -+ * This can only work if the loop device is used read-only, file backed, -+ * and if the new backing store is the same size and type as the old -+ * backing store. -+ */ -+static int loop_change_fd(struct loop_device *lo, unsigned int arg) -+{ -+ struct file *file, *old_file; -+ struct inode *inode; -+ struct loop_switch_request w; -+ struct bio *bio; -+ int error; -+ -+ error = -EINVAL; -+ /* loop must be read-only */ -+ if (!(lo->lo_flags & LO_FLAGS_READ_ONLY)) -+ goto out; -+ -+ /* loop must be file backed */ -+ if (!(lo->lo_flags & LO_FLAGS_DO_BMAP)) -+ goto out; -+ -+ error = -EBADF; -+ file = fget(arg); -+ if (!file) -+ goto out; -+ -+ inode = file->f_path.dentry->d_inode; -+ old_file = lo->lo_backing_file; -+ -+ error = -EINVAL; -+ /* new backing store must be file backed */ -+ if (!S_ISREG(inode->i_mode)) -+ goto out_putf; -+ -+ /* new backing store must support reads */ -+ if (!file->f_op || !file->f_op->read) -+ goto out_putf; -+ -+ /* new backing store must be same size as the old one */ -+ if(i_size_read(inode) != i_size_read(old_file->f_path.dentry->d_inode)) -+ goto out_putf; -+ -+ /* loop must be in properly initialized state */ -+ if(lo->lo_queue->make_request_fn != loop_make_request_real) -+ goto out_putf; -+ -+ error = -ENOMEM; -+ bio = bio_alloc(GFP_KERNEL, 1); -+ if (!bio) -+ goto out_putf; -+ -+ /* wait for loop thread to do the switch */ -+ init_completion(&w.wait); -+ w.file = file; -+ bio->bi_private = &w; -+ bio->bi_bdev = NULL; -+ bio->bi_rw = 0; -+ loop_make_request_real(lo->lo_queue, bio); -+ wait_for_completion(&w.wait); -+ -+ fput(old_file); -+ return 0; -+ -+out_putf: -+ fput(file); -+out: -+ return error; -+} -+ -+static int loop_get_threads_count(struct loop_device *lo) -+{ -+ int x, y; -+ -+ if (lo->lo_flags & LO_FLAGS_DO_BMAP) { -+ /* file backed has only 1 pre-allocated page, so limit to 1 helper thread */ -+ return 1; -+ } -+ -+ y = lo_threads[0]; -+ for (x = 1; x < (sizeof(lo_threads) / sizeof(int)); x += 2) { -+ if (lo_threads[x + 1] && (lo->lo_number == lo_threads[x])) { -+ y = lo_threads[x + 1]; -+ break; -+ } -+ } -+ return y; -+} -+ -+#if defined(LOOP_HAVE_CONGESTED_FN) -+static int loop_congested(void *data, int bits) -+{ -+ struct loop_device *lo = data; -+ struct bio *bio; -+ int ret = 0; -+ unsigned long flags; -+ const int cong = (1 << BDI_sync_congested) | (1 << BDI_async_congested); -+ -+ if(lo && lo->lo_backingQueue) { -+ /* check if backing device is congested */ -+ ret |= 
bdi_congested(&lo->lo_backingQueue->backing_dev_info, bits); -+ /* check if loop device is low on resources */ -+ spin_lock_irqsave(&lo->lo_lock, flags); -+ bio = lo->lo_bio_free0; -+ if(!bio || !bio->bi_next) { -+ ret |= cong; -+ } -+ bio = lo->lo_bio_free1; -+ if(!bio || !bio->bi_next) { -+ ret |= cong; -+ } -+ spin_unlock_irqrestore(&lo->lo_lock, flags); -+ } -+ return (ret & bits); -+} -+#endif -+ -+static int loop_set_fd(struct loop_device *lo, unsigned int ldom, -+ struct block_device *bdev, unsigned int arg) -+{ -+ struct file *file; -+ struct inode *inode; -+ struct block_device *lo_device = NULL; -+ int lo_flags = 0; -+ int error; -+ int x, y; -+ struct task_struct *t[LO_THREADS_MAX]; -+ -+ error = -EBADF; -+ file = fget(arg); -+ if (!file) -+ goto out; -+ -+ error = -EINVAL; -+ inode = file->f_path.dentry->d_inode; -+ -+ if (!(file->f_mode & FMODE_WRITE)) -+ lo_flags |= LO_FLAGS_READ_ONLY; -+ -+ init_completion(&lo->lo_done); -+ spin_lock_init(&lo->lo_lock); -+ init_waitqueue_head(&lo->lo_bio_wait); -+ init_waitqueue_head(&lo->lo_buf_wait); -+ atomic_set(&lo->lo_pending, 0); -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ lo->lo_keyscrub_fn = 0; -+#endif -+ lo->lo_offset = lo->lo_sizelimit = 0; -+ lo->lo_offs_sec = lo->lo_iv_remove = 0; -+ lo->lo_encryption = NULL; -+ lo->lo_encrypt_key_size = 0; -+ lo->transfer = NULL; -+ lo->lo_crypt_name[0] = 0; -+ lo->lo_file_name[0] = 0; -+ lo->lo_init[1] = lo->lo_init[0] = 0; -+#if LINUX_VERSION_CODE >= 0x30600 -+ lo->lo_key_owner = GLOBAL_ROOT_UID; -+#else -+ lo->lo_key_owner = 0; -+#endif -+ lo->ioctl = NULL; -+ lo->key_data = NULL; -+ lo->lo_bio_que0 = NULL; -+ lo->lo_bio_free1 = lo->lo_bio_free0 = NULL; -+ lo->lo_bio_flshMax = lo->lo_bio_flshCnt = 0; -+ -+ if (S_ISBLK(inode->i_mode)) { -+ lo_device = inode->i_bdev; -+ if (lo_device == bdev) { -+ error = -EBUSY; -+ goto out_putf; -+ } -+ if (loop_prealloc_init(lo, 0)) { -+ error = -ENOMEM; -+ goto out_putf; -+ } -+ if (bdev_read_only(lo_device)) -+ lo_flags |= LO_FLAGS_READ_ONLY; -+ else -+ filemap_fdatawrite(inode->i_mapping); -+ } else if (S_ISREG(inode->i_mode)) { -+ /* -+ * If we can't read - sorry. If we only can't write - well, -+ * it's going to be read-only. 
-+ */ -+ if (!file->f_op || !file->f_op->read) -+ goto out_putf; -+ -+ if (!file->f_op->write) -+ lo_flags |= LO_FLAGS_READ_ONLY; -+ -+ lo_flags |= LO_FLAGS_DO_BMAP; -+ if (loop_prealloc_init(lo, 1)) { -+ error = -ENOMEM; -+ goto out_putf; -+ } -+ } else -+ goto out_putf; -+ -+ get_file(file); -+ -+ if (!(ldom & FMODE_WRITE)) -+ lo_flags |= LO_FLAGS_READ_ONLY; -+ -+ set_device_ro(bdev, (lo_flags & LO_FLAGS_READ_ONLY) != 0); -+ -+ lo->lo_device = lo_device; -+ lo->lo_flags = lo_flags; -+ if(lo_flags & LO_FLAGS_READ_ONLY) -+ lo->lo_flags |= 0x200000; /* export to user space */ -+ lo->lo_backing_file = file; -+ if (figure_loop_size(lo, bdev)) { -+ error = -EFBIG; -+ goto out_cleanup; -+ } -+ -+ /* -+ * set queue make_request_fn, and add limits based on lower level -+ * device -+ */ -+ blk_queue_make_request(lo->lo_queue, loop_make_request_err); -+ blk_queue_bounce_limit(lo->lo_queue, BLK_BOUNCE_ANY); -+ blk_queue_max_segment_size(lo->lo_queue, PAGE_CACHE_SIZE); -+ blk_queue_segment_boundary(lo->lo_queue, PAGE_CACHE_SIZE - 1); -+ blk_queue_max_segments(lo->lo_queue, BLK_MAX_SEGMENTS); -+ blk_queue_max_hw_sectors(lo->lo_queue, BLK_DEF_MAX_SECTORS); -+ lo->lo_queue->limits.cluster = 0; -+ blk_queue_flush(lo->lo_queue, 0); -+ lo->lo_backingQueue = 0; -+ -+ /* -+ * we remap to a block device, make sure we correctly stack limits -+ */ -+ if (S_ISBLK(inode->i_mode) && lo_device) { -+ struct request_queue *q = bdev_get_queue(lo_device); -+ -+ blk_queue_logical_block_size(lo->lo_queue, queue_logical_block_size(q)); -+ blk_queue_flush(lo->lo_queue, q->flush_flags & (REQ_FLUSH | REQ_FUA)); -+ lo->lo_queue->limits.io_min = q->limits.io_min; -+ if(lo->lo_queue->limits.io_min > (BLK_MAX_SEGMENTS * PAGE_CACHE_SIZE)) -+ lo->lo_queue->limits.io_min = (BLK_MAX_SEGMENTS * PAGE_CACHE_SIZE); -+ lo->lo_queue->limits.io_opt = q->limits.io_opt; -+ if(lo->lo_queue->limits.io_opt > (BLK_MAX_SEGMENTS * PAGE_CACHE_SIZE)) -+ lo->lo_queue->limits.io_opt = (BLK_MAX_SEGMENTS * PAGE_CACHE_SIZE); -+ lo->lo_backingQueue = q; -+ } -+ -+ if (lo_flags & LO_FLAGS_DO_BMAP) { -+ lo->old_gfp_mask = mapping_gfp_mask(inode->i_mapping); -+ mapping_set_gfp_mask(inode->i_mapping, (lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS)) | __GFP_HIGH); -+ } else { -+ lo->old_gfp_mask = -1; -+ } -+ -+ loop_set_softblksz(lo, bdev); -+ -+ y = loop_get_threads_count(lo); -+ for(x = 0; x < y; x++) { -+ if(y > 1) { -+ t[x] = kthread_create(loop_thread, lo, "loop%d%c", lo->lo_number, x + 'a'); -+ } else { -+ t[x] = kthread_create(loop_thread, lo, "loop%d", lo->lo_number); -+ } -+ if (IS_ERR(t[x])) { -+ error = PTR_ERR(t[x]); -+ while(--x >= 0) { -+ kthread_stop(t[x]); -+ } -+ goto out_mapping; -+ } -+ } -+ for(x = 0; x < y; x++) { -+ wake_up_process(t[x]); -+ wait_for_completion(&lo->lo_done); -+ } -+ -+ fput(file); -+#if defined(LOOP_HAVE_CONGESTED_FN) -+ lo->lo_queue->backing_dev_info.congested_data = lo; -+ lo->lo_queue->backing_dev_info.congested_fn = loop_congested; -+#endif -+ wmb(); -+ lo->lo_queue->queuedata = lo; -+ __module_get(THIS_MODULE); -+ return 0; -+ -+ out_mapping: -+ if(lo->old_gfp_mask != -1) -+ mapping_set_gfp_mask(inode->i_mapping, lo->old_gfp_mask); -+ out_cleanup: -+ loop_prealloc_cleanup(lo); -+ fput(file); -+ out_putf: -+ fput(file); -+ out: -+ return error; -+} -+ -+static int loop_release_xfer(struct loop_device *lo) -+{ -+ int err = 0; -+ struct loop_func_table *xfer = lo->lo_encryption; -+ -+ if (xfer) { -+ lo->transfer = NULL; -+ if (xfer->release) -+ err = xfer->release(lo); -+ lo->lo_encryption = NULL; -+ module_put(xfer->owner); 
-+ } -+ return err; -+} -+ -+static int loop_init_xfer(struct loop_device *lo, struct loop_func_table *xfer, struct loop_info64 *i) -+{ -+ int err = 0; -+ -+ if (xfer) { -+ struct module *owner = xfer->owner; -+ -+ if(!try_module_get(owner)) -+ return -EINVAL; -+ if (xfer->init) -+ err = xfer->init(lo, i); -+ if (err) -+ module_put(owner); -+ else -+ lo->lo_encryption = xfer; -+ } -+ return err; -+} -+ -+static int loop_clr_fd(struct loop_device *lo, struct block_device *bdev) -+{ -+ struct file *filp = lo->lo_backing_file; -+ int gfp = lo->old_gfp_mask; -+ int bdocnt, x, y; -+ -+ /* sync /dev/loop? device */ -+ sync_blockdev(bdev); -+ /* sync backing /dev/hda? device */ -+ sync_blockdev(lo->lo_device); -+ -+ for(x = 0; x < 20; x++) { -+ spin_lock(&lo->lo_ioctl_spin); -+ bdocnt = lo->lo_refcnt; -+ spin_unlock(&lo->lo_ioctl_spin); -+ if(bdocnt == 1) break; -+ /* work around reference count race */ -+ msleep(50); -+ } -+ -+ if (bdocnt != 1) /* one for this fd being open */ -+ return -EBUSY; -+ if (filp==NULL) -+ return -EINVAL; -+ -+ lo->lo_queue->queuedata = NULL; -+ lo->lo_queue->make_request_fn = loop_make_request_err; -+ lo->lo_backingQueue = 0; -+ y = loop_get_threads_count(lo); -+ for(x = 0; x < y; x++) { -+ if (atomic_dec_and_test(&lo->lo_pending)) -+ wake_up_interruptible_all(&lo->lo_bio_wait); -+ } -+ for(x = 0; x < y; x++) { -+ wait_for_completion(&lo->lo_done); -+ } -+ blk_queue_flush(lo->lo_queue, 0); -+ loop_prealloc_cleanup(lo); -+ lo->lo_backing_file = NULL; -+ loop_release_xfer(lo); -+ lo->transfer = NULL; -+ lo->ioctl = NULL; -+ lo->lo_device = NULL; -+ lo->lo_encryption = NULL; -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ lo->lo_keyscrub_fn = 0; -+#endif -+ lo->lo_offset = lo->lo_sizelimit = 0; -+ lo->lo_offs_sec = lo->lo_iv_remove = 0; -+ lo->lo_encrypt_key_size = 0; -+ lo->lo_flags = 0; -+ lo->lo_init[1] = lo->lo_init[0] = 0; -+#if LINUX_VERSION_CODE >= 0x30600 -+ lo->lo_key_owner = GLOBAL_ROOT_UID; -+#else -+ lo->lo_key_owner = 0; -+#endif -+ lo->key_data = NULL; -+ memset(lo->lo_encrypt_key, 0, LO_KEY_SIZE); -+ memset(lo->lo_crypt_name, 0, LO_NAME_SIZE); -+ memset(lo->lo_file_name, 0, LO_NAME_SIZE); -+ invalidate_bdev(bdev); -+ set_capacity(disks[lo->lo_number], 0); -+ if (gfp != -1) -+ mapping_set_gfp_mask(filp->f_path.dentry->d_inode->i_mapping, gfp); -+ fput(filp); -+ module_put(THIS_MODULE); -+ return 0; -+} -+ -+static int loop_set_status(struct loop_device *lo, struct block_device *bdev, struct loop_info64 *info) -+{ -+ int err; -+ struct loop_func_table *xfer = NULL; -+#if LINUX_VERSION_CODE >= 0x30600 -+ kuid_t uid = current_uid(); -+ -+ if (lo->lo_encrypt_key_size && !uid_eq(lo->lo_key_owner, uid) && !capable(CAP_SYS_ADMIN)) -+ return -EPERM; -+#else -+ uid_t uid = current_uid(); -+ -+ if (lo->lo_encrypt_key_size && lo->lo_key_owner != uid && !capable(CAP_SYS_ADMIN)) -+ return -EPERM; -+#endif -+ if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE) -+ return -EINVAL; -+ -+ err = loop_release_xfer(lo); -+ if (err) -+ return err; -+ -+ if ((loff_t)info->lo_offset < 0) { -+ /* negative offset == remove offset from IV computations */ -+ lo->lo_offset = -(info->lo_offset); -+ lo->lo_iv_remove = lo->lo_offset >> 9; -+ } else { -+ /* positive offset == include offset in IV computations */ -+ lo->lo_offset = info->lo_offset; -+ lo->lo_iv_remove = 0; -+ } -+ lo->lo_offs_sec = lo->lo_offset >> 9; -+ lo->lo_sizelimit = info->lo_sizelimit; -+ err = figure_loop_size(lo, bdev); -+ if (err) -+ return err; -+ loop_set_softblksz(lo, bdev); -+ -+ if (info->lo_encrypt_type) { 
-+ unsigned int type = info->lo_encrypt_type; -+ -+ if (type >= MAX_LO_CRYPT) -+ return -EINVAL; -+ xfer = xfer_funcs[type]; -+ if (xfer == NULL) -+ return -EINVAL; -+ } else if(!(lo->lo_flags & LO_FLAGS_DO_BMAP)) { -+ blk_queue_max_hw_sectors(lo->lo_queue, PAGE_CACHE_SIZE >> 9); -+ } -+ err = loop_init_xfer(lo, xfer, info); -+ if (err) -+ return err; -+ -+ if (!xfer) -+ xfer = &none_funcs; -+ lo->transfer = xfer->transfer; -+ lo->ioctl = xfer->ioctl; -+ -+ memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE); -+ memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE); -+ lo->lo_file_name[LO_NAME_SIZE-1] = 0; -+ lo->lo_crypt_name[LO_NAME_SIZE-1] = 0; -+ lo->lo_encrypt_key_size = info->lo_encrypt_key_size; -+ lo->lo_init[0] = info->lo_init[0]; -+ lo->lo_init[1] = info->lo_init[1]; -+ if (info->lo_encrypt_key_size) { -+ memcpy(lo->lo_encrypt_key, info->lo_encrypt_key, -+ info->lo_encrypt_key_size); -+ lo->lo_key_owner = uid; -+ } -+ -+ lo->lo_queue->make_request_fn = loop_make_request_real; -+ return 0; -+} -+ -+static int loop_get_status(struct loop_device *lo, struct loop_info64 *info) -+{ -+ struct file *file = lo->lo_backing_file; -+ struct kstat stat; -+ int error; -+ -+#if LINUX_VERSION_CODE >= 0x30900 -+ error = vfs_getattr(&file->f_path, &stat); -+#else -+ error = vfs_getattr(file->f_vfsmnt, file->f_path.dentry, &stat); -+#endif -+ if (error) -+ return error; -+ memset(info, 0, sizeof(*info)); -+ info->lo_number = lo->lo_number; -+ info->lo_device = huge_encode_dev(stat.dev); -+ info->lo_inode = stat.ino; -+ info->lo_rdevice = huge_encode_dev(lo->lo_device ? stat.rdev : stat.dev); -+ info->lo_offset = lo->lo_iv_remove ? -(lo->lo_offset) : lo->lo_offset; -+ info->lo_sizelimit = lo->lo_sizelimit; -+ info->lo_flags = lo->lo_flags; -+ memcpy(info->lo_file_name, lo->lo_file_name, LO_NAME_SIZE); -+ memcpy(info->lo_crypt_name, lo->lo_crypt_name, LO_NAME_SIZE); -+ info->lo_encrypt_type = lo->lo_encryption ? 
lo->lo_encryption->number : 0; -+ if (lo->lo_encrypt_key_size && capable(CAP_SYS_ADMIN)) { -+ info->lo_encrypt_key_size = lo->lo_encrypt_key_size; -+ memcpy(info->lo_encrypt_key, lo->lo_encrypt_key, -+ lo->lo_encrypt_key_size); -+ info->lo_init[0] = lo->lo_init[0]; -+ info->lo_init[1] = lo->lo_init[1]; -+ } -+ return 0; -+} -+ -+static void -+loop_info64_from_old(const struct loop_info *info, struct loop_info64 *info64) -+{ -+ memset(info64, 0, sizeof(*info64)); -+ info64->lo_number = info->lo_number; -+ info64->lo_device = info->lo_device; -+ info64->lo_inode = info->lo_inode; -+ info64->lo_rdevice = info->lo_rdevice; -+ info64->lo_offset = info->lo_offset; -+ info64->lo_encrypt_type = info->lo_encrypt_type; -+ info64->lo_encrypt_key_size = info->lo_encrypt_key_size; -+ info64->lo_flags = info->lo_flags; -+ info64->lo_init[0] = info->lo_init[0]; -+ info64->lo_init[1] = info->lo_init[1]; -+ if (info->lo_encrypt_type == LO_CRYPT_CRYPTOAPI) -+ memcpy(info64->lo_crypt_name, info->lo_name, LO_NAME_SIZE); -+ else -+ memcpy(info64->lo_file_name, info->lo_name, LO_NAME_SIZE); -+ memcpy(info64->lo_encrypt_key, info->lo_encrypt_key, LO_KEY_SIZE); -+} -+ -+static int -+loop_info64_to_old(struct loop_info64 *info64, struct loop_info *info) -+{ -+ memset(info, 0, sizeof(*info)); -+ info->lo_number = info64->lo_number; -+ info->lo_device = info64->lo_device; -+ info->lo_inode = info64->lo_inode; -+ info->lo_rdevice = info64->lo_rdevice; -+ info->lo_offset = info64->lo_offset; -+ info->lo_encrypt_type = info64->lo_encrypt_type; -+ info->lo_encrypt_key_size = info64->lo_encrypt_key_size; -+ info->lo_flags = info64->lo_flags; -+ info->lo_init[0] = info64->lo_init[0]; -+ info->lo_init[1] = info64->lo_init[1]; -+ if (info->lo_encrypt_type == LO_CRYPT_CRYPTOAPI) -+ memcpy(info->lo_name, info64->lo_crypt_name, LO_NAME_SIZE); -+ else -+ memcpy(info->lo_name, info64->lo_file_name, LO_NAME_SIZE); -+ memcpy(info->lo_encrypt_key, info64->lo_encrypt_key, LO_KEY_SIZE); -+ -+ /* error in case values were truncated */ -+ if (info->lo_device != info64->lo_device || -+ info->lo_rdevice != info64->lo_rdevice || -+ info->lo_inode != info64->lo_inode || -+ info->lo_offset != info64->lo_offset || -+ info64->lo_sizelimit) -+ return -EOVERFLOW; -+ -+ return 0; -+} -+ -+static int -+loop_set_status_old(struct loop_device *lo, struct block_device *bdev, const struct loop_info *arg) -+{ -+ struct loop_info info; -+ struct loop_info64 info64; -+ -+ if (copy_from_user(&info, arg, sizeof (struct loop_info))) -+ return -EFAULT; -+ loop_info64_from_old(&info, &info64); -+ memset(&info.lo_encrypt_key[0], 0, sizeof(info.lo_encrypt_key)); -+ return loop_set_status(lo, bdev, &info64); -+} -+ -+static int -+loop_set_status64(struct loop_device *lo, struct block_device *bdev, struct loop_info64 *arg) -+{ -+ struct loop_info64 info64; -+ -+ if (copy_from_user(&info64, arg, sizeof (struct loop_info64))) -+ return -EFAULT; -+ return loop_set_status(lo, bdev, &info64); -+} -+ -+static int -+loop_get_status_old(struct loop_device *lo, struct loop_info *arg) { -+ struct loop_info info; -+ struct loop_info64 info64; -+ int err = 0; -+ -+ if (!arg) -+ err = -EINVAL; -+ if (!err) -+ err = loop_get_status(lo, &info64); -+ if (!err) -+ err = loop_info64_to_old(&info64, &info); -+ if (!err && copy_to_user(arg, &info, sizeof(info))) -+ err = -EFAULT; -+ -+ return err; -+} -+ -+static int -+loop_get_status64(struct loop_device *lo, struct loop_info64 *arg) { -+ struct loop_info64 info64; -+ int err = 0; -+ -+ if (!arg) -+ err = -EINVAL; -+ if (!err) -+ 
err = loop_get_status(lo, &info64); -+ if (!err && copy_to_user(arg, &info64, sizeof(info64))) -+ err = -EFAULT; -+ -+ return err; -+} -+ -+static int lo_ioctl(struct block_device *bdev, fmode_t ldom, unsigned int cmd, unsigned long arg) -+{ -+ struct loop_device *lo = bdev->bd_disk->private_data; -+ int err; -+ wait_queue_t waitq; -+ -+ /* -+ * mutual exclusion - lock -+ */ -+ init_waitqueue_entry(&waitq, current); -+ add_wait_queue(&lo->lo_ioctl_wait, &waitq); -+ for (;;) { -+ set_current_state(TASK_UNINTERRUPTIBLE); -+ spin_lock(&lo->lo_ioctl_spin); -+ err = lo->lo_ioctl_busy; -+ if(!err) lo->lo_ioctl_busy = 1; -+ spin_unlock(&lo->lo_ioctl_spin); -+ if(!err) break; -+ schedule(); -+ } -+ set_current_state(TASK_RUNNING); -+ remove_wait_queue(&lo->lo_ioctl_wait, &waitq); -+ -+ /* -+ * LOOP_SET_FD can only be called when no device is attached. -+ * All other ioctls can only be called when a device is attached. -+ */ -+ if (bdev->bd_disk->queue->queuedata != NULL) { -+ if (cmd == LOOP_SET_FD) { -+ err = -EBUSY; -+ goto out_err; -+ } -+ } else { -+ if (cmd != LOOP_SET_FD) { -+ err = -ENXIO; -+ goto out_err; -+ } -+ } -+ -+ switch (cmd) { -+ case LOOP_SET_FD: -+ err = loop_set_fd(lo, ldom, bdev, arg); -+ break; -+ case LOOP_CHANGE_FD: -+ err = loop_change_fd(lo, arg); -+ break; -+ case LOOP_CLR_FD: -+ err = loop_clr_fd(lo, bdev); -+ break; -+ case LOOP_SET_STATUS: -+ err = loop_set_status_old(lo, bdev, (struct loop_info *) arg); -+ break; -+ case LOOP_GET_STATUS: -+ err = loop_get_status_old(lo, (struct loop_info *) arg); -+ break; -+ case LOOP_SET_STATUS64: -+ err = loop_set_status64(lo, bdev, (struct loop_info64 *) arg); -+ break; -+ case LOOP_GET_STATUS64: -+ err = loop_get_status64(lo, (struct loop_info64 *) arg); -+ break; -+ case LOOP_RECOMPUTE_DEV_SIZE: -+ err = figure_loop_size(lo, bdev); -+ break; -+ default: -+ err = lo->ioctl ? 
lo->ioctl(lo, cmd, arg) : -EINVAL; -+ } -+out_err: -+ /* -+ * mutual exclusion - unlock -+ */ -+ spin_lock(&lo->lo_ioctl_spin); -+ lo->lo_ioctl_busy = 0; -+ spin_unlock(&lo->lo_ioctl_spin); -+ wake_up_all(&lo->lo_ioctl_wait); -+ -+ return err; -+} -+ -+#if defined(CONFIG_COMPAT) && defined(HAVE_COMPAT_IOCTL) -+struct loop_info32 { -+ compat_int_t lo_number; /* ioctl r/o */ -+ compat_dev_t lo_device; /* ioctl r/o */ -+ compat_ulong_t lo_inode; /* ioctl r/o */ -+ compat_dev_t lo_rdevice; /* ioctl r/o */ -+ compat_int_t lo_offset; -+ compat_int_t lo_encrypt_type; -+ compat_int_t lo_encrypt_key_size; /* ioctl w/o */ -+ compat_int_t lo_flags; /* ioctl r/o */ -+ char lo_name[LO_NAME_SIZE]; -+ unsigned char lo_encrypt_key[LO_KEY_SIZE]; /* ioctl w/o */ -+ compat_ulong_t lo_init[2]; -+ char reserved[4]; -+}; -+ -+static int lo_compat_ioctl(struct block_device *p1, fmode_t p2, unsigned int cmd, unsigned long arg) -+{ -+ mm_segment_t old_fs = get_fs(); -+ struct loop_info l; -+ struct loop_info32 *ul = (struct loop_info32 *)arg; -+ int err = -ENOIOCTLCMD; -+ -+ switch (cmd) { -+ case LOOP_SET_FD: -+ case LOOP_CLR_FD: -+ case LOOP_SET_STATUS64: -+ case LOOP_GET_STATUS64: -+ case LOOP_CHANGE_FD: -+ case LOOP_MULTI_KEY_SETUP: -+ case LOOP_MULTI_KEY_SETUP_V3: -+ case LOOP_RECOMPUTE_DEV_SIZE: -+ err = lo_ioctl(p1, p2, cmd, arg); -+ break; -+ case LOOP_SET_STATUS: -+ memset(&l, 0, sizeof(l)); -+ err = get_user(l.lo_number, &ul->lo_number); -+ err |= get_user(l.lo_device, &ul->lo_device); -+ err |= get_user(l.lo_inode, &ul->lo_inode); -+ err |= get_user(l.lo_rdevice, &ul->lo_rdevice); -+ err |= copy_from_user(&l.lo_offset, &ul->lo_offset, -+ 8 + (unsigned long)l.lo_init - (unsigned long)&l.lo_offset); -+ if (err) { -+ err = -EFAULT; -+ } else { -+ set_fs (KERNEL_DS); -+ err = lo_ioctl(p1, p2, cmd, (unsigned long)&l); -+ set_fs (old_fs); -+ } -+ memset(&l, 0, sizeof(l)); -+ break; -+ case LOOP_GET_STATUS: -+ set_fs (KERNEL_DS); -+ err = lo_ioctl(p1, p2, cmd, (unsigned long)&l); -+ set_fs (old_fs); -+ if (!err) { -+ err = put_user(l.lo_number, &ul->lo_number); -+ err |= put_user(l.lo_device, &ul->lo_device); -+ err |= put_user(l.lo_inode, &ul->lo_inode); -+ err |= put_user(l.lo_rdevice, &ul->lo_rdevice); -+ err |= copy_to_user(&ul->lo_offset, &l.lo_offset, -+ (unsigned long)l.lo_init - (unsigned long)&l.lo_offset); -+ if (err) -+ err = -EFAULT; -+ } -+ memset(&l, 0, sizeof(l)); -+ break; -+ -+ } -+ return err; -+} -+#endif -+ -+static int lo_open(struct block_device *bdev, fmode_t mode) -+{ -+ struct loop_device *lo = bdev->bd_disk->private_data; -+ -+ spin_lock(&lo->lo_ioctl_spin); -+ lo->lo_refcnt++; -+ spin_unlock(&lo->lo_ioctl_spin); -+ return 0; -+} -+ -+#if LINUX_VERSION_CODE >= 0x30a00 -+static void lo_release(struct gendisk *disk, fmode_t mode) -+#else -+static int lo_release(struct gendisk *disk, fmode_t mode) -+#endif -+{ -+ struct loop_device *lo = disk->private_data; -+ -+ spin_lock(&lo->lo_ioctl_spin); -+ lo->lo_refcnt--; -+ spin_unlock(&lo->lo_ioctl_spin); -+#if LINUX_VERSION_CODE < 0x30a00 -+ return 0; -+#endif -+} -+ -+static struct block_device_operations lo_fops = { -+ .owner = THIS_MODULE, -+ .open = lo_open, -+ .release = lo_release, -+ .ioctl = lo_ioctl, -+#if defined(CONFIG_COMPAT) && defined(HAVE_COMPAT_IOCTL) -+ .compat_ioctl = lo_compat_ioctl, -+#endif -+}; -+ -+/* -+ * And now the modules code and kernel interface. 
-+ */ -+MODULE_LICENSE("GPL"); -+MODULE_ALIAS_BLOCKDEV_MAJOR(LOOP_MAJOR); -+ -+int loop_register_transfer(struct loop_func_table *funcs) -+{ -+ unsigned int n = funcs->number; -+ -+ if (n >= MAX_LO_CRYPT || xfer_funcs[n]) -+ return -EINVAL; -+ xfer_funcs[n] = funcs; -+ return 0; -+} -+ -+int loop_unregister_transfer(int number) -+{ -+ unsigned int n = number; -+ struct loop_device *lo; -+ struct loop_func_table *xfer; -+ int x; -+ -+ if (n == 0 || n >= MAX_LO_CRYPT || (xfer = xfer_funcs[n]) == NULL) -+ return -EINVAL; -+ xfer_funcs[n] = NULL; -+ for (x = 0; x < max_loop; x++) { -+ lo = loop_dev_ptr_arr[x]; -+ if (!lo) -+ continue; -+ if (lo->lo_encryption == xfer) -+ loop_release_xfer(lo); -+ } -+ return 0; -+} -+ -+EXPORT_SYMBOL(loop_register_transfer); -+EXPORT_SYMBOL(loop_unregister_transfer); -+ -+int __init loop_init(void) -+{ -+ int i; -+ -+#ifdef CONFIG_BLK_DEV_LOOP_AES -+#if defined(CONFIG_BLK_DEV_LOOP_PADLOCK) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+ if((boot_cpu_data.x86 >= 6) && CentaurHauls_ID_and_enabled_ACE()) { -+ xfer_funcs[LO_CRYPT_AES] = &funcs_padlock_aes; -+ printk(KERN_INFO "loop: padlock hardware AES enabled\n"); -+ } else -+#endif -+#if defined(CONFIG_BLK_DEV_LOOP_INTELAES) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+ if((boot_cpu_data.x86 >= 6) && ((cpuid_ecx(1) & 0x02000000) == 0x02000000)) { -+ xfer_funcs[LO_CRYPT_AES] = &funcs_intel_aes; -+ printk("loop: Intel hardware AES enabled\n"); -+ } else -+#endif -+#endif -+ { } /* needed because of above else statements */ -+ -+ if ((max_loop < 1) || (max_loop > 256)) { -+ printk(KERN_WARNING "loop: invalid max_loop (must be between" -+ " 1 and 256), using default (8)\n"); -+ max_loop = 8; -+ } -+ -+ if (register_blkdev(LOOP_MAJOR, "loop")) -+ return -EIO; -+ -+ loop_dev_ptr_arr = kmalloc(max_loop * sizeof(struct loop_device *), GFP_KERNEL); -+ if (!loop_dev_ptr_arr) -+ goto out_mem1; -+ -+ disks = kmalloc(max_loop * sizeof(struct gendisk *), GFP_KERNEL); -+ if (!disks) -+ goto out_mem2; -+ -+ for (i = 0; i < max_loop; i++) { -+ loop_dev_ptr_arr[i] = kmalloc(sizeof(struct loop_device), GFP_KERNEL); -+ if (!loop_dev_ptr_arr[i]) -+ goto out_mem3; -+ } -+ -+ for (i = 0; i < max_loop; i++) { -+ disks[i] = alloc_disk(1); -+ if (!disks[i]) -+ goto out_mem4; -+ } -+ -+ for (i = 0; i < max_loop; i++) { -+ disks[i]->queue = blk_alloc_queue(GFP_KERNEL); -+ if (!disks[i]->queue) -+ goto out_mem5; -+ disks[i]->queue->queuedata = NULL; -+ blk_queue_make_request(disks[i]->queue, loop_make_request_err); -+ } -+ -+ for (i = 0; i < (sizeof(lo_prealloc) / sizeof(int)); i += 2) { -+ if (!lo_prealloc[i]) -+ continue; -+ if (lo_prealloc[i] < LO_PREALLOC_MIN) -+ lo_prealloc[i] = LO_PREALLOC_MIN; -+ if (lo_prealloc[i] > LO_PREALLOC_MAX) -+ lo_prealloc[i] = LO_PREALLOC_MAX; -+ } -+ for (i = 0; i < (sizeof(lo_threads) / sizeof(int)); i += 2) { -+ if (!lo_threads[i]) -+ continue; -+ if (lo_threads[i] < LO_THREADS_MIN) -+ lo_threads[i] = LO_THREADS_MIN; -+ if (lo_threads[i] > LO_THREADS_MAX) -+ lo_threads[i] = LO_THREADS_MAX; -+ } -+ -+#if defined(IOCTL32_COMPATIBLE_PTR) -+ register_ioctl32_conversion(LOOP_MULTI_KEY_SETUP, IOCTL32_COMPATIBLE_PTR); -+ register_ioctl32_conversion(LOOP_MULTI_KEY_SETUP_V3, IOCTL32_COMPATIBLE_PTR); -+ register_ioctl32_conversion(LOOP_RECOMPUTE_DEV_SIZE, IOCTL32_COMPATIBLE_PTR); -+#endif -+ -+#ifdef CONFIG_DEVFS_FS -+ devfs_mk_dir("loop"); -+#endif -+ -+ for (i = 0; i < max_loop; i++) { -+ struct loop_device *lo = loop_dev_ptr_arr[i]; -+ struct gendisk *disk = disks[i]; -+ memset(lo, 0, 
sizeof(struct loop_device)); -+ lo->lo_number = i; -+ lo->lo_queue = disk->queue; -+ spin_lock_init(&lo->lo_ioctl_spin); -+ init_waitqueue_head(&lo->lo_ioctl_wait); -+ disk->major = LOOP_MAJOR; -+ disk->first_minor = i; -+ disk->fops = &lo_fops; -+ sprintf(disk->disk_name, "loop%d", i); -+#ifdef CONFIG_DEVFS_FS -+ sprintf(disk->devfs_name, "loop/%d", i); -+#endif -+ disk->private_data = lo; -+ add_disk(disk); -+ } -+ -+#ifdef CONFIG_BLK_DEV_LOOP_AES -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ printk(KERN_INFO "loop: AES key scrubbing enabled\n"); -+#endif -+#endif -+ printk(KERN_INFO "loop: loaded (max %d devices)\n", max_loop); -+ return 0; -+ -+out_mem5: -+ while (i--) -+ blk_cleanup_queue(disks[i]->queue); -+ i = max_loop; -+out_mem4: -+ while (i--) -+ put_disk(disks[i]); -+ i = max_loop; -+out_mem3: -+ while (i--) -+ kfree(loop_dev_ptr_arr[i]); -+ kfree(disks); -+out_mem2: -+ kfree(loop_dev_ptr_arr); -+out_mem1: -+ unregister_blkdev(LOOP_MAJOR, "loop"); -+ printk(KERN_ERR "loop: ran out of memory\n"); -+ return -ENOMEM; -+} -+ -+void loop_exit(void) -+{ -+ int i; -+ -+ for (i = 0; i < max_loop; i++) { -+ del_gendisk(disks[i]); -+ put_disk(disks[i]); -+ blk_cleanup_queue(loop_dev_ptr_arr[i]->lo_queue); -+ kfree(loop_dev_ptr_arr[i]); -+ } -+#ifdef CONFIG_DEVFS_FS -+ devfs_remove("loop"); -+#endif -+ unregister_blkdev(LOOP_MAJOR, "loop"); -+ kfree(disks); -+ kfree(loop_dev_ptr_arr); -+ -+#if defined(IOCTL32_COMPATIBLE_PTR) -+ unregister_ioctl32_conversion(LOOP_MULTI_KEY_SETUP); -+ unregister_ioctl32_conversion(LOOP_MULTI_KEY_SETUP_V3); -+ unregister_ioctl32_conversion(LOOP_RECOMPUTE_DEV_SIZE); -+#endif -+} -+ -+module_init(loop_init); -+module_exit(loop_exit); -+ -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+void loop_add_keyscrub_fn(struct loop_device *lo, void (*fn)(void *), void *ptr) -+{ -+ lo->lo_keyscrub_ptr = ptr; -+ wmb(); -+ lo->lo_keyscrub_fn = fn; -+ wake_up_interruptible(&lo->lo_bio_wait); -+} -+EXPORT_SYMBOL(loop_add_keyscrub_fn); -+#endif -diff -urN linux-3.10-noloop/drivers/misc/Makefile linux-3.10-AES/drivers/misc/Makefile ---- linux-3.10-noloop/drivers/misc/Makefile 2013-07-01 01:13:29.000000000 +0300 -+++ linux-3.10-AES/drivers/misc/Makefile 2013-07-01 16:12:48.000000000 +0300 -@@ -2,6 +2,33 @@ - # Makefile for misc devices that really don't fit anywhere else. - # - -+ifeq ($(CONFIG_BLK_DEV_LOOP_AES),y) -+AES_X86_ASM=n -+ifeq ($(CONFIG_X86),y) -+ifneq ($(CONFIG_X86_64),y) -+ AES_X86_ASM=y -+endif -+endif -+ifeq ($(AES_X86_ASM),y) -+ obj-y += aes-x86.o md5-x86.o crypto-ksym.o -+ AFLAGS_aes-x86.o := -DUSE_UNDERLINE=1 -+ifeq ($(CONFIG_BLK_DEV_LOOP_INTELAES),y) -+ obj-y += aes-intel32.o -+endif -+else -+ifeq ($(CONFIG_X86_64),y) -+ obj-y += aes-amd64.o md5-amd64.o md5-2x-amd64.o crypto-ksym.o -+ AFLAGS_aes-amd64.o := -DUSE_UNDERLINE=1 -+ifeq ($(CONFIG_BLK_DEV_LOOP_INTELAES),y) -+ obj-y += aes-intel64.o -+endif -+else -+ obj-y += aes.o md5.o crypto-ksym.o -+ CFLAGS_aes.o := -DDATA_ALWAYS_ALIGNED=1 -+endif -+endif -+endif -+ - obj-$(CONFIG_IBM_ASM) += ibmasm/ - obj-$(CONFIG_AD525X_DPOT) += ad525x_dpot.o - obj-$(CONFIG_AD525X_DPOT_I2C) += ad525x_dpot-i2c.o -diff -urN linux-3.10-noloop/drivers/misc/aes-amd64.S linux-3.10-AES/drivers/misc/aes-amd64.S ---- linux-3.10-noloop/drivers/misc/aes-amd64.S 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/aes-amd64.S 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,897 @@ -+// -+// Copyright (c) 2001, Dr Brian Gladman , Worcester, UK. -+// All rights reserved. 
-+// -+// TERMS -+// -+// Redistribution and use in source and binary forms, with or without -+// modification, are permitted subject to the following conditions: -+// -+// 1. Redistributions of source code must retain the above copyright -+// notice, this list of conditions and the following disclaimer. -+// -+// 2. Redistributions in binary form must reproduce the above copyright -+// notice, this list of conditions and the following disclaimer in the -+// documentation and/or other materials provided with the distribution. -+// -+// 3. The copyright holder's name must not be used to endorse or promote -+// any products derived from this software without his specific prior -+// written permission. -+// -+// This software is provided 'as is' with no express or implied warranties -+// of correctness or fitness for purpose. -+ -+// Modified by Jari Ruusu, December 24 2001 -+// - Converted syntax to GNU CPP/assembler syntax -+// - C programming interface converted back to "old" API -+// - Minor portability cleanups and speed optimizations -+ -+// Modified by Jari Ruusu, April 11 2002 -+// - Added above copyright and terms to resulting object code so that -+// binary distributions can avoid legal trouble -+ -+// Modified by Jari Ruusu, June 12 2004 -+// - Converted 32 bit x86 code to 64 bit AMD64 code -+// - Re-wrote encrypt and decrypt code from scratch -+ -+// An AES (Rijndael) implementation for the AMD64. This version only -+// implements the standard AES block length (128 bits, 16 bytes). This code -+// does not preserve the rax, rcx, rdx, rsi, rdi or r8-r11 registers or the -+// artihmetic status flags. However, the rbx, rbp and r12-r15 registers are -+// preserved across calls. -+ -+// void aes_set_key(aes_context *cx, const unsigned char key[], const int key_len, const int f) -+// void aes_encrypt(const aes_context *cx, const unsigned char in_blk[], unsigned char out_blk[]) -+// void aes_decrypt(const aes_context *cx, const unsigned char in_blk[], unsigned char out_blk[]) -+ -+#if defined(USE_UNDERLINE) -+# define aes_set_key _aes_set_key -+# define aes_encrypt _aes_encrypt -+# define aes_decrypt _aes_decrypt -+#endif -+#if !defined(ALIGN64BYTES) -+# define ALIGN64BYTES 64 -+#endif -+ -+ .file "aes-amd64.S" -+ .globl aes_set_key -+ .globl aes_encrypt -+ .globl aes_decrypt -+ -+ .section .rodata -+copyright: -+ .ascii " \000" -+ .ascii "Copyright (c) 2001, Dr Brian Gladman , Worcester, UK.\000" -+ .ascii "All rights reserved.\000" -+ .ascii " \000" -+ .ascii "TERMS\000" -+ .ascii " \000" -+ .ascii " Redistribution and use in source and binary forms, with or without\000" -+ .ascii " modification, are permitted subject to the following conditions:\000" -+ .ascii " \000" -+ .ascii " 1. Redistributions of source code must retain the above copyright\000" -+ .ascii " notice, this list of conditions and the following disclaimer.\000" -+ .ascii " \000" -+ .ascii " 2. Redistributions in binary form must reproduce the above copyright\000" -+ .ascii " notice, this list of conditions and the following disclaimer in the\000" -+ .ascii " documentation and/or other materials provided with the distribution.\000" -+ .ascii " \000" -+ .ascii " 3. 
The copyright holder's name must not be used to endorse or promote\000" -+ .ascii " any products derived from this software without his specific prior\000" -+ .ascii " written permission.\000" -+ .ascii " \000" -+ .ascii " This software is provided 'as is' with no express or implied warranties\000" -+ .ascii " of correctness or fitness for purpose.\000" -+ .ascii " \000" -+ -+#define tlen 1024 // length of each of 4 'xor' arrays (256 32-bit words) -+ -+// offsets in context structure -+ -+#define nkey 0 // key length, size 4 -+#define nrnd 4 // number of rounds, size 4 -+#define ekey 8 // encryption key schedule base address, size 256 -+#define dkey 264 // decryption key schedule base address, size 256 -+ -+// This macro performs a forward encryption cycle. It is entered with -+// the first previous round column values in I1E, I2E, I3E and I4E and -+// exits with the final values OU1, OU2, OU3 and OU4 registers. -+ -+#define fwd_rnd(p1,p2,I1E,I1B,I1H,I2E,I2B,I2H,I3E,I3B,I3R,I4E,I4B,I4R,OU1,OU2,OU3,OU4) \ -+ movl p2(%rbp),OU1 ;\ -+ movl p2+4(%rbp),OU2 ;\ -+ movl p2+8(%rbp),OU3 ;\ -+ movl p2+12(%rbp),OU4 ;\ -+ movzbl I1B,%edi ;\ -+ movzbl I2B,%esi ;\ -+ movzbl I3B,%r8d ;\ -+ movzbl I4B,%r13d ;\ -+ shrl $8,I3E ;\ -+ shrl $8,I4E ;\ -+ xorl p1(,%rdi,4),OU1 ;\ -+ xorl p1(,%rsi,4),OU2 ;\ -+ xorl p1(,%r8,4),OU3 ;\ -+ xorl p1(,%r13,4),OU4 ;\ -+ movzbl I2H,%esi ;\ -+ movzbl I3B,%r8d ;\ -+ movzbl I4B,%r13d ;\ -+ movzbl I1H,%edi ;\ -+ shrl $8,I3E ;\ -+ shrl $8,I4E ;\ -+ xorl p1+tlen(,%rsi,4),OU1 ;\ -+ xorl p1+tlen(,%r8,4),OU2 ;\ -+ xorl p1+tlen(,%r13,4),OU3 ;\ -+ xorl p1+tlen(,%rdi,4),OU4 ;\ -+ shrl $16,I1E ;\ -+ shrl $16,I2E ;\ -+ movzbl I3B,%r8d ;\ -+ movzbl I4B,%r13d ;\ -+ movzbl I1B,%edi ;\ -+ movzbl I2B,%esi ;\ -+ xorl p1+2*tlen(,%r8,4),OU1 ;\ -+ xorl p1+2*tlen(,%r13,4),OU2 ;\ -+ xorl p1+2*tlen(,%rdi,4),OU3 ;\ -+ xorl p1+2*tlen(,%rsi,4),OU4 ;\ -+ shrl $8,I4E ;\ -+ movzbl I1H,%edi ;\ -+ movzbl I2H,%esi ;\ -+ shrl $8,I3E ;\ -+ xorl p1+3*tlen(,I4R,4),OU1 ;\ -+ xorl p1+3*tlen(,%rdi,4),OU2 ;\ -+ xorl p1+3*tlen(,%rsi,4),OU3 ;\ -+ xorl p1+3*tlen(,I3R,4),OU4 -+ -+// This macro performs an inverse encryption cycle. It is entered with -+// the first previous round column values in I1E, I2E, I3E and I4E and -+// exits with the final values OU1, OU2, OU3 and OU4 registers. 
-+ -+#define inv_rnd(p1,p2,I1E,I1B,I1R,I2E,I2B,I2R,I3E,I3B,I3H,I4E,I4B,I4H,OU1,OU2,OU3,OU4) \ -+ movl p2+12(%rbp),OU4 ;\ -+ movl p2+8(%rbp),OU3 ;\ -+ movl p2+4(%rbp),OU2 ;\ -+ movl p2(%rbp),OU1 ;\ -+ movzbl I4B,%edi ;\ -+ movzbl I3B,%esi ;\ -+ movzbl I2B,%r8d ;\ -+ movzbl I1B,%r13d ;\ -+ shrl $8,I2E ;\ -+ shrl $8,I1E ;\ -+ xorl p1(,%rdi,4),OU4 ;\ -+ xorl p1(,%rsi,4),OU3 ;\ -+ xorl p1(,%r8,4),OU2 ;\ -+ xorl p1(,%r13,4),OU1 ;\ -+ movzbl I3H,%esi ;\ -+ movzbl I2B,%r8d ;\ -+ movzbl I1B,%r13d ;\ -+ movzbl I4H,%edi ;\ -+ shrl $8,I2E ;\ -+ shrl $8,I1E ;\ -+ xorl p1+tlen(,%rsi,4),OU4 ;\ -+ xorl p1+tlen(,%r8,4),OU3 ;\ -+ xorl p1+tlen(,%r13,4),OU2 ;\ -+ xorl p1+tlen(,%rdi,4),OU1 ;\ -+ shrl $16,I4E ;\ -+ shrl $16,I3E ;\ -+ movzbl I2B,%r8d ;\ -+ movzbl I1B,%r13d ;\ -+ movzbl I4B,%edi ;\ -+ movzbl I3B,%esi ;\ -+ xorl p1+2*tlen(,%r8,4),OU4 ;\ -+ xorl p1+2*tlen(,%r13,4),OU3 ;\ -+ xorl p1+2*tlen(,%rdi,4),OU2 ;\ -+ xorl p1+2*tlen(,%rsi,4),OU1 ;\ -+ shrl $8,I1E ;\ -+ movzbl I4H,%edi ;\ -+ movzbl I3H,%esi ;\ -+ shrl $8,I2E ;\ -+ xorl p1+3*tlen(,I1R,4),OU4 ;\ -+ xorl p1+3*tlen(,%rdi,4),OU3 ;\ -+ xorl p1+3*tlen(,%rsi,4),OU2 ;\ -+ xorl p1+3*tlen(,I2R,4),OU1 -+ -+// AES (Rijndael) Encryption Subroutine -+ -+// rdi = pointer to AES context -+// rsi = pointer to input plaintext bytes -+// rdx = pointer to output ciphertext bytes -+ -+ .text -+ .align ALIGN64BYTES -+aes_encrypt: -+ movl (%rsi),%eax // read in plaintext -+ movl 4(%rsi),%ecx -+ movl 8(%rsi),%r10d -+ movl 12(%rsi),%r11d -+ -+ pushq %rbp -+ leaq ekey+16(%rdi),%rbp // encryption key pointer -+ movq %rdx,%r9 // pointer to out block -+ movl nrnd(%rdi),%edx // number of rounds -+ pushq %rbx -+ pushq %r13 -+ pushq %r14 -+ pushq %r15 -+ -+ xorl -16(%rbp),%eax // xor in first round key -+ xorl -12(%rbp),%ecx -+ xorl -8(%rbp),%r10d -+ xorl -4(%rbp),%r11d -+ -+ subl $10,%edx -+ je aes_15 -+ addq $32,%rbp -+ subl $2,%edx -+ je aes_13 -+ addq $32,%rbp -+ -+ fwd_rnd(aes_ft_tab,-64,%eax,%al,%ah,%ecx,%cl,%ch,%r10d,%r10b,%r10,%r11d,%r11b,%r11,%ebx,%edx,%r14d,%r15d) -+ fwd_rnd(aes_ft_tab,-48,%ebx,%bl,%bh,%edx,%dl,%dh,%r14d,%r14b,%r14,%r15d,%r15b,%r15,%eax,%ecx,%r10d,%r11d) -+ jmp aes_13 -+ .align ALIGN64BYTES -+aes_13: fwd_rnd(aes_ft_tab,-32,%eax,%al,%ah,%ecx,%cl,%ch,%r10d,%r10b,%r10,%r11d,%r11b,%r11,%ebx,%edx,%r14d,%r15d) -+ fwd_rnd(aes_ft_tab,-16,%ebx,%bl,%bh,%edx,%dl,%dh,%r14d,%r14b,%r14,%r15d,%r15b,%r15,%eax,%ecx,%r10d,%r11d) -+ jmp aes_15 -+ .align ALIGN64BYTES -+aes_15: fwd_rnd(aes_ft_tab,0, %eax,%al,%ah,%ecx,%cl,%ch,%r10d,%r10b,%r10,%r11d,%r11b,%r11,%ebx,%edx,%r14d,%r15d) -+ fwd_rnd(aes_ft_tab,16, %ebx,%bl,%bh,%edx,%dl,%dh,%r14d,%r14b,%r14,%r15d,%r15b,%r15,%eax,%ecx,%r10d,%r11d) -+ fwd_rnd(aes_ft_tab,32, %eax,%al,%ah,%ecx,%cl,%ch,%r10d,%r10b,%r10,%r11d,%r11b,%r11,%ebx,%edx,%r14d,%r15d) -+ fwd_rnd(aes_ft_tab,48, %ebx,%bl,%bh,%edx,%dl,%dh,%r14d,%r14b,%r14,%r15d,%r15b,%r15,%eax,%ecx,%r10d,%r11d) -+ fwd_rnd(aes_ft_tab,64, %eax,%al,%ah,%ecx,%cl,%ch,%r10d,%r10b,%r10,%r11d,%r11b,%r11,%ebx,%edx,%r14d,%r15d) -+ fwd_rnd(aes_ft_tab,80, %ebx,%bl,%bh,%edx,%dl,%dh,%r14d,%r14b,%r14,%r15d,%r15b,%r15,%eax,%ecx,%r10d,%r11d) -+ fwd_rnd(aes_ft_tab,96, %eax,%al,%ah,%ecx,%cl,%ch,%r10d,%r10b,%r10,%r11d,%r11b,%r11,%ebx,%edx,%r14d,%r15d) -+ fwd_rnd(aes_ft_tab,112,%ebx,%bl,%bh,%edx,%dl,%dh,%r14d,%r14b,%r14,%r15d,%r15b,%r15,%eax,%ecx,%r10d,%r11d) -+ fwd_rnd(aes_ft_tab,128,%eax,%al,%ah,%ecx,%cl,%ch,%r10d,%r10b,%r10,%r11d,%r11b,%r11,%ebx,%edx,%r14d,%r15d) -+ fwd_rnd(aes_fl_tab,144,%ebx,%bl,%bh,%edx,%dl,%dh,%r14d,%r14b,%r14,%r15d,%r15b,%r15,%eax,%ecx,%r10d,%r11d) -+ -+ popq %r15 -+ popq 
%r14 -+ popq %r13 -+ popq %rbx -+ popq %rbp -+ -+ movl %eax,(%r9) // move final values to the output array. -+ movl %ecx,4(%r9) -+ movl %r10d,8(%r9) -+ movl %r11d,12(%r9) -+ ret -+ -+// AES (Rijndael) Decryption Subroutine -+ -+// rdi = pointer to AES context -+// rsi = pointer to input ciphertext bytes -+// rdx = pointer to output plaintext bytes -+ -+ .align ALIGN64BYTES -+aes_decrypt: -+ movl 12(%rsi),%eax // read in ciphertext -+ movl 8(%rsi),%ecx -+ movl 4(%rsi),%r10d -+ movl (%rsi),%r11d -+ -+ pushq %rbp -+ leaq dkey+16(%rdi),%rbp // decryption key pointer -+ movq %rdx,%r9 // pointer to out block -+ movl nrnd(%rdi),%edx // number of rounds -+ pushq %rbx -+ pushq %r13 -+ pushq %r14 -+ pushq %r15 -+ -+ xorl -4(%rbp),%eax // xor in first round key -+ xorl -8(%rbp),%ecx -+ xorl -12(%rbp),%r10d -+ xorl -16(%rbp),%r11d -+ -+ subl $10,%edx -+ je aes_25 -+ addq $32,%rbp -+ subl $2,%edx -+ je aes_23 -+ addq $32,%rbp -+ -+ inv_rnd(aes_it_tab,-64,%r11d,%r11b,%r11,%r10d,%r10b,%r10,%ecx,%cl,%ch,%eax,%al,%ah,%r15d,%r14d,%edx,%ebx) -+ inv_rnd(aes_it_tab,-48,%r15d,%r15b,%r15,%r14d,%r14b,%r14,%edx,%dl,%dh,%ebx,%bl,%bh,%r11d,%r10d,%ecx,%eax) -+ jmp aes_23 -+ .align ALIGN64BYTES -+aes_23: inv_rnd(aes_it_tab,-32,%r11d,%r11b,%r11,%r10d,%r10b,%r10,%ecx,%cl,%ch,%eax,%al,%ah,%r15d,%r14d,%edx,%ebx) -+ inv_rnd(aes_it_tab,-16,%r15d,%r15b,%r15,%r14d,%r14b,%r14,%edx,%dl,%dh,%ebx,%bl,%bh,%r11d,%r10d,%ecx,%eax) -+ jmp aes_25 -+ .align ALIGN64BYTES -+aes_25: inv_rnd(aes_it_tab,0, %r11d,%r11b,%r11,%r10d,%r10b,%r10,%ecx,%cl,%ch,%eax,%al,%ah,%r15d,%r14d,%edx,%ebx) -+ inv_rnd(aes_it_tab,16, %r15d,%r15b,%r15,%r14d,%r14b,%r14,%edx,%dl,%dh,%ebx,%bl,%bh,%r11d,%r10d,%ecx,%eax) -+ inv_rnd(aes_it_tab,32, %r11d,%r11b,%r11,%r10d,%r10b,%r10,%ecx,%cl,%ch,%eax,%al,%ah,%r15d,%r14d,%edx,%ebx) -+ inv_rnd(aes_it_tab,48, %r15d,%r15b,%r15,%r14d,%r14b,%r14,%edx,%dl,%dh,%ebx,%bl,%bh,%r11d,%r10d,%ecx,%eax) -+ inv_rnd(aes_it_tab,64, %r11d,%r11b,%r11,%r10d,%r10b,%r10,%ecx,%cl,%ch,%eax,%al,%ah,%r15d,%r14d,%edx,%ebx) -+ inv_rnd(aes_it_tab,80, %r15d,%r15b,%r15,%r14d,%r14b,%r14,%edx,%dl,%dh,%ebx,%bl,%bh,%r11d,%r10d,%ecx,%eax) -+ inv_rnd(aes_it_tab,96, %r11d,%r11b,%r11,%r10d,%r10b,%r10,%ecx,%cl,%ch,%eax,%al,%ah,%r15d,%r14d,%edx,%ebx) -+ inv_rnd(aes_it_tab,112,%r15d,%r15b,%r15,%r14d,%r14b,%r14,%edx,%dl,%dh,%ebx,%bl,%bh,%r11d,%r10d,%ecx,%eax) -+ inv_rnd(aes_it_tab,128,%r11d,%r11b,%r11,%r10d,%r10b,%r10,%ecx,%cl,%ch,%eax,%al,%ah,%r15d,%r14d,%edx,%ebx) -+ inv_rnd(aes_il_tab,144,%r15d,%r15b,%r15,%r14d,%r14b,%r14,%edx,%dl,%dh,%ebx,%bl,%bh,%r11d,%r10d,%ecx,%eax) -+ -+ popq %r15 -+ popq %r14 -+ popq %r13 -+ popq %rbx -+ popq %rbp -+ -+ movl %eax,12(%r9) // move final values to the output array. -+ movl %ecx,8(%r9) -+ movl %r10d,4(%r9) -+ movl %r11d,(%r9) -+ ret -+ -+// AES (Rijndael) Key Schedule Subroutine -+ -+// This macro performs a column mixing operation on an input 32-bit -+// word to give a 32-bit result. It uses each of the 4 bytes in the -+// the input column to index 4 different tables of 256 32-bit words -+// that are xored together to form the output value. 
-+ -+#define mix_col(p1) \ -+ movzbl %bl,%ecx ;\ -+ movl p1(,%rcx,4),%eax ;\ -+ movzbl %bh,%ecx ;\ -+ ror $16,%ebx ;\ -+ xorl p1+tlen(,%rcx,4),%eax ;\ -+ movzbl %bl,%ecx ;\ -+ xorl p1+2*tlen(,%rcx,4),%eax ;\ -+ movzbl %bh,%ecx ;\ -+ xorl p1+3*tlen(,%rcx,4),%eax -+ -+// Key Schedule Macros -+ -+#define ksc4(p1) \ -+ rol $24,%ebx ;\ -+ mix_col(aes_fl_tab) ;\ -+ ror $8,%ebx ;\ -+ xorl 4*p1+aes_rcon_tab,%eax ;\ -+ xorl %eax,%esi ;\ -+ xorl %esi,%ebp ;\ -+ movl %esi,16*p1(%rdi) ;\ -+ movl %ebp,16*p1+4(%rdi) ;\ -+ xorl %ebp,%edx ;\ -+ xorl %edx,%ebx ;\ -+ movl %edx,16*p1+8(%rdi) ;\ -+ movl %ebx,16*p1+12(%rdi) -+ -+#define ksc6(p1) \ -+ rol $24,%ebx ;\ -+ mix_col(aes_fl_tab) ;\ -+ ror $8,%ebx ;\ -+ xorl 4*p1+aes_rcon_tab,%eax ;\ -+ xorl 24*p1-24(%rdi),%eax ;\ -+ movl %eax,24*p1(%rdi) ;\ -+ xorl 24*p1-20(%rdi),%eax ;\ -+ movl %eax,24*p1+4(%rdi) ;\ -+ xorl %eax,%esi ;\ -+ xorl %esi,%ebp ;\ -+ movl %esi,24*p1+8(%rdi) ;\ -+ movl %ebp,24*p1+12(%rdi) ;\ -+ xorl %ebp,%edx ;\ -+ xorl %edx,%ebx ;\ -+ movl %edx,24*p1+16(%rdi) ;\ -+ movl %ebx,24*p1+20(%rdi) -+ -+#define ksc8(p1) \ -+ rol $24,%ebx ;\ -+ mix_col(aes_fl_tab) ;\ -+ ror $8,%ebx ;\ -+ xorl 4*p1+aes_rcon_tab,%eax ;\ -+ xorl 32*p1-32(%rdi),%eax ;\ -+ movl %eax,32*p1(%rdi) ;\ -+ xorl 32*p1-28(%rdi),%eax ;\ -+ movl %eax,32*p1+4(%rdi) ;\ -+ xorl 32*p1-24(%rdi),%eax ;\ -+ movl %eax,32*p1+8(%rdi) ;\ -+ xorl 32*p1-20(%rdi),%eax ;\ -+ movl %eax,32*p1+12(%rdi) ;\ -+ pushq %rbx ;\ -+ movl %eax,%ebx ;\ -+ mix_col(aes_fl_tab) ;\ -+ popq %rbx ;\ -+ xorl %eax,%esi ;\ -+ xorl %esi,%ebp ;\ -+ movl %esi,32*p1+16(%rdi) ;\ -+ movl %ebp,32*p1+20(%rdi) ;\ -+ xorl %ebp,%edx ;\ -+ xorl %edx,%ebx ;\ -+ movl %edx,32*p1+24(%rdi) ;\ -+ movl %ebx,32*p1+28(%rdi) -+ -+// rdi = pointer to AES context -+// rsi = pointer to key bytes -+// rdx = key length, bytes or bits -+// rcx = ed_flag, 1=encrypt only, 0=both encrypt and decrypt -+ -+ .align ALIGN64BYTES -+aes_set_key: -+ pushfq -+ pushq %rbp -+ pushq %rbx -+ -+ movq %rcx,%r11 // ed_flg -+ movq %rdx,%rcx // key length -+ movq %rdi,%r10 // AES context -+ -+ cmpl $128,%ecx -+ jb aes_30 -+ shrl $3,%ecx -+aes_30: cmpl $32,%ecx -+ je aes_32 -+ cmpl $24,%ecx -+ je aes_32 -+ movl $16,%ecx -+aes_32: shrl $2,%ecx -+ movl %ecx,nkey(%r10) -+ leaq 6(%rcx),%rax // 10/12/14 for 4/6/8 32-bit key length -+ movl %eax,nrnd(%r10) -+ leaq ekey(%r10),%rdi // key position in AES context -+ cld -+ movl %ecx,%eax // save key length in eax -+ rep ; movsl // words in the key schedule -+ movl -4(%rsi),%ebx // put some values in registers -+ movl -8(%rsi),%edx // to allow faster code -+ movl -12(%rsi),%ebp -+ movl -16(%rsi),%esi -+ -+ cmpl $4,%eax // jump on key size -+ je aes_36 -+ cmpl $6,%eax -+ je aes_35 -+ -+ ksc8(0) -+ ksc8(1) -+ ksc8(2) -+ ksc8(3) -+ ksc8(4) -+ ksc8(5) -+ ksc8(6) -+ jmp aes_37 -+aes_35: ksc6(0) -+ ksc6(1) -+ ksc6(2) -+ ksc6(3) -+ ksc6(4) -+ ksc6(5) -+ ksc6(6) -+ ksc6(7) -+ jmp aes_37 -+aes_36: ksc4(0) -+ ksc4(1) -+ ksc4(2) -+ ksc4(3) -+ ksc4(4) -+ ksc4(5) -+ ksc4(6) -+ ksc4(7) -+ ksc4(8) -+ ksc4(9) -+aes_37: cmpl $0,%r11d // ed_flg -+ jne aes_39 -+ -+// compile decryption key schedule from encryption schedule - reverse -+// order and do mix_column operation on round keys except first and last -+ -+ movl nrnd(%r10),%eax // kt = cx->d_key + nc * cx->Nrnd -+ shl $2,%rax -+ leaq dkey(%r10,%rax,4),%rdi -+ leaq ekey(%r10),%rsi // kf = cx->e_key -+ -+ movsq // copy first round key (unmodified) -+ movsq -+ subq $32,%rdi -+ movl $1,%r9d -+aes_38: // do mix column on each column of -+ lodsl // each round key -+ movl %eax,%ebx -+ 
mix_col(aes_im_tab) -+ stosl -+ lodsl -+ movl %eax,%ebx -+ mix_col(aes_im_tab) -+ stosl -+ lodsl -+ movl %eax,%ebx -+ mix_col(aes_im_tab) -+ stosl -+ lodsl -+ movl %eax,%ebx -+ mix_col(aes_im_tab) -+ stosl -+ subq $32,%rdi -+ -+ incl %r9d -+ cmpl nrnd(%r10),%r9d -+ jb aes_38 -+ -+ movsq // copy last round key (unmodified) -+ movsq -+aes_39: popq %rbx -+ popq %rbp -+ popfq -+ ret -+ -+ -+// finite field multiplies by {02}, {04} and {08} -+ -+#define f2(x) ((x<<1)^(((x>>7)&1)*0x11b)) -+#define f4(x) ((x<<2)^(((x>>6)&1)*0x11b)^(((x>>6)&2)*0x11b)) -+#define f8(x) ((x<<3)^(((x>>5)&1)*0x11b)^(((x>>5)&2)*0x11b)^(((x>>5)&4)*0x11b)) -+ -+// finite field multiplies required in table generation -+ -+#define f3(x) (f2(x) ^ x) -+#define f9(x) (f8(x) ^ x) -+#define fb(x) (f8(x) ^ f2(x) ^ x) -+#define fd(x) (f8(x) ^ f4(x) ^ x) -+#define fe(x) (f8(x) ^ f4(x) ^ f2(x)) -+ -+// These defines generate the forward table entries -+ -+#define u0(x) ((f3(x) << 24) | (x << 16) | (x << 8) | f2(x)) -+#define u1(x) ((x << 24) | (x << 16) | (f2(x) << 8) | f3(x)) -+#define u2(x) ((x << 24) | (f2(x) << 16) | (f3(x) << 8) | x) -+#define u3(x) ((f2(x) << 24) | (f3(x) << 16) | (x << 8) | x) -+ -+// These defines generate the inverse table entries -+ -+#define v0(x) ((fb(x) << 24) | (fd(x) << 16) | (f9(x) << 8) | fe(x)) -+#define v1(x) ((fd(x) << 24) | (f9(x) << 16) | (fe(x) << 8) | fb(x)) -+#define v2(x) ((f9(x) << 24) | (fe(x) << 16) | (fb(x) << 8) | fd(x)) -+#define v3(x) ((fe(x) << 24) | (fb(x) << 16) | (fd(x) << 8) | f9(x)) -+ -+// These defines generate entries for the last round tables -+ -+#define w0(x) (x) -+#define w1(x) (x << 8) -+#define w2(x) (x << 16) -+#define w3(x) (x << 24) -+ -+// macro to generate inverse mix column tables (needed for the key schedule) -+ -+#define im_data0(p1) \ -+ .long p1(0x00),p1(0x01),p1(0x02),p1(0x03),p1(0x04),p1(0x05),p1(0x06),p1(0x07) ;\ -+ .long p1(0x08),p1(0x09),p1(0x0a),p1(0x0b),p1(0x0c),p1(0x0d),p1(0x0e),p1(0x0f) ;\ -+ .long p1(0x10),p1(0x11),p1(0x12),p1(0x13),p1(0x14),p1(0x15),p1(0x16),p1(0x17) ;\ -+ .long p1(0x18),p1(0x19),p1(0x1a),p1(0x1b),p1(0x1c),p1(0x1d),p1(0x1e),p1(0x1f) -+#define im_data1(p1) \ -+ .long p1(0x20),p1(0x21),p1(0x22),p1(0x23),p1(0x24),p1(0x25),p1(0x26),p1(0x27) ;\ -+ .long p1(0x28),p1(0x29),p1(0x2a),p1(0x2b),p1(0x2c),p1(0x2d),p1(0x2e),p1(0x2f) ;\ -+ .long p1(0x30),p1(0x31),p1(0x32),p1(0x33),p1(0x34),p1(0x35),p1(0x36),p1(0x37) ;\ -+ .long p1(0x38),p1(0x39),p1(0x3a),p1(0x3b),p1(0x3c),p1(0x3d),p1(0x3e),p1(0x3f) -+#define im_data2(p1) \ -+ .long p1(0x40),p1(0x41),p1(0x42),p1(0x43),p1(0x44),p1(0x45),p1(0x46),p1(0x47) ;\ -+ .long p1(0x48),p1(0x49),p1(0x4a),p1(0x4b),p1(0x4c),p1(0x4d),p1(0x4e),p1(0x4f) ;\ -+ .long p1(0x50),p1(0x51),p1(0x52),p1(0x53),p1(0x54),p1(0x55),p1(0x56),p1(0x57) ;\ -+ .long p1(0x58),p1(0x59),p1(0x5a),p1(0x5b),p1(0x5c),p1(0x5d),p1(0x5e),p1(0x5f) -+#define im_data3(p1) \ -+ .long p1(0x60),p1(0x61),p1(0x62),p1(0x63),p1(0x64),p1(0x65),p1(0x66),p1(0x67) ;\ -+ .long p1(0x68),p1(0x69),p1(0x6a),p1(0x6b),p1(0x6c),p1(0x6d),p1(0x6e),p1(0x6f) ;\ -+ .long p1(0x70),p1(0x71),p1(0x72),p1(0x73),p1(0x74),p1(0x75),p1(0x76),p1(0x77) ;\ -+ .long p1(0x78),p1(0x79),p1(0x7a),p1(0x7b),p1(0x7c),p1(0x7d),p1(0x7e),p1(0x7f) -+#define im_data4(p1) \ -+ .long p1(0x80),p1(0x81),p1(0x82),p1(0x83),p1(0x84),p1(0x85),p1(0x86),p1(0x87) ;\ -+ .long p1(0x88),p1(0x89),p1(0x8a),p1(0x8b),p1(0x8c),p1(0x8d),p1(0x8e),p1(0x8f) ;\ -+ .long p1(0x90),p1(0x91),p1(0x92),p1(0x93),p1(0x94),p1(0x95),p1(0x96),p1(0x97) ;\ -+ .long 
p1(0x98),p1(0x99),p1(0x9a),p1(0x9b),p1(0x9c),p1(0x9d),p1(0x9e),p1(0x9f) -+#define im_data5(p1) \ -+ .long p1(0xa0),p1(0xa1),p1(0xa2),p1(0xa3),p1(0xa4),p1(0xa5),p1(0xa6),p1(0xa7) ;\ -+ .long p1(0xa8),p1(0xa9),p1(0xaa),p1(0xab),p1(0xac),p1(0xad),p1(0xae),p1(0xaf) ;\ -+ .long p1(0xb0),p1(0xb1),p1(0xb2),p1(0xb3),p1(0xb4),p1(0xb5),p1(0xb6),p1(0xb7) ;\ -+ .long p1(0xb8),p1(0xb9),p1(0xba),p1(0xbb),p1(0xbc),p1(0xbd),p1(0xbe),p1(0xbf) -+#define im_data6(p1) \ -+ .long p1(0xc0),p1(0xc1),p1(0xc2),p1(0xc3),p1(0xc4),p1(0xc5),p1(0xc6),p1(0xc7) ;\ -+ .long p1(0xc8),p1(0xc9),p1(0xca),p1(0xcb),p1(0xcc),p1(0xcd),p1(0xce),p1(0xcf) ;\ -+ .long p1(0xd0),p1(0xd1),p1(0xd2),p1(0xd3),p1(0xd4),p1(0xd5),p1(0xd6),p1(0xd7) ;\ -+ .long p1(0xd8),p1(0xd9),p1(0xda),p1(0xdb),p1(0xdc),p1(0xdd),p1(0xde),p1(0xdf) -+#define im_data7(p1) \ -+ .long p1(0xe0),p1(0xe1),p1(0xe2),p1(0xe3),p1(0xe4),p1(0xe5),p1(0xe6),p1(0xe7) ;\ -+ .long p1(0xe8),p1(0xe9),p1(0xea),p1(0xeb),p1(0xec),p1(0xed),p1(0xee),p1(0xef) ;\ -+ .long p1(0xf0),p1(0xf1),p1(0xf2),p1(0xf3),p1(0xf4),p1(0xf5),p1(0xf6),p1(0xf7) ;\ -+ .long p1(0xf8),p1(0xf9),p1(0xfa),p1(0xfb),p1(0xfc),p1(0xfd),p1(0xfe),p1(0xff) -+ -+// S-box data - 256 entries -+ -+#define sb_data0(p1) \ -+ .long p1(0x63),p1(0x7c),p1(0x77),p1(0x7b),p1(0xf2),p1(0x6b),p1(0x6f),p1(0xc5) ;\ -+ .long p1(0x30),p1(0x01),p1(0x67),p1(0x2b),p1(0xfe),p1(0xd7),p1(0xab),p1(0x76) ;\ -+ .long p1(0xca),p1(0x82),p1(0xc9),p1(0x7d),p1(0xfa),p1(0x59),p1(0x47),p1(0xf0) ;\ -+ .long p1(0xad),p1(0xd4),p1(0xa2),p1(0xaf),p1(0x9c),p1(0xa4),p1(0x72),p1(0xc0) -+#define sb_data1(p1) \ -+ .long p1(0xb7),p1(0xfd),p1(0x93),p1(0x26),p1(0x36),p1(0x3f),p1(0xf7),p1(0xcc) ;\ -+ .long p1(0x34),p1(0xa5),p1(0xe5),p1(0xf1),p1(0x71),p1(0xd8),p1(0x31),p1(0x15) ;\ -+ .long p1(0x04),p1(0xc7),p1(0x23),p1(0xc3),p1(0x18),p1(0x96),p1(0x05),p1(0x9a) ;\ -+ .long p1(0x07),p1(0x12),p1(0x80),p1(0xe2),p1(0xeb),p1(0x27),p1(0xb2),p1(0x75) -+#define sb_data2(p1) \ -+ .long p1(0x09),p1(0x83),p1(0x2c),p1(0x1a),p1(0x1b),p1(0x6e),p1(0x5a),p1(0xa0) ;\ -+ .long p1(0x52),p1(0x3b),p1(0xd6),p1(0xb3),p1(0x29),p1(0xe3),p1(0x2f),p1(0x84) ;\ -+ .long p1(0x53),p1(0xd1),p1(0x00),p1(0xed),p1(0x20),p1(0xfc),p1(0xb1),p1(0x5b) ;\ -+ .long p1(0x6a),p1(0xcb),p1(0xbe),p1(0x39),p1(0x4a),p1(0x4c),p1(0x58),p1(0xcf) -+#define sb_data3(p1) \ -+ .long p1(0xd0),p1(0xef),p1(0xaa),p1(0xfb),p1(0x43),p1(0x4d),p1(0x33),p1(0x85) ;\ -+ .long p1(0x45),p1(0xf9),p1(0x02),p1(0x7f),p1(0x50),p1(0x3c),p1(0x9f),p1(0xa8) ;\ -+ .long p1(0x51),p1(0xa3),p1(0x40),p1(0x8f),p1(0x92),p1(0x9d),p1(0x38),p1(0xf5) ;\ -+ .long p1(0xbc),p1(0xb6),p1(0xda),p1(0x21),p1(0x10),p1(0xff),p1(0xf3),p1(0xd2) -+#define sb_data4(p1) \ -+ .long p1(0xcd),p1(0x0c),p1(0x13),p1(0xec),p1(0x5f),p1(0x97),p1(0x44),p1(0x17) ;\ -+ .long p1(0xc4),p1(0xa7),p1(0x7e),p1(0x3d),p1(0x64),p1(0x5d),p1(0x19),p1(0x73) ;\ -+ .long p1(0x60),p1(0x81),p1(0x4f),p1(0xdc),p1(0x22),p1(0x2a),p1(0x90),p1(0x88) ;\ -+ .long p1(0x46),p1(0xee),p1(0xb8),p1(0x14),p1(0xde),p1(0x5e),p1(0x0b),p1(0xdb) -+#define sb_data5(p1) \ -+ .long p1(0xe0),p1(0x32),p1(0x3a),p1(0x0a),p1(0x49),p1(0x06),p1(0x24),p1(0x5c) ;\ -+ .long p1(0xc2),p1(0xd3),p1(0xac),p1(0x62),p1(0x91),p1(0x95),p1(0xe4),p1(0x79) ;\ -+ .long p1(0xe7),p1(0xc8),p1(0x37),p1(0x6d),p1(0x8d),p1(0xd5),p1(0x4e),p1(0xa9) ;\ -+ .long p1(0x6c),p1(0x56),p1(0xf4),p1(0xea),p1(0x65),p1(0x7a),p1(0xae),p1(0x08) -+#define sb_data6(p1) \ -+ .long p1(0xba),p1(0x78),p1(0x25),p1(0x2e),p1(0x1c),p1(0xa6),p1(0xb4),p1(0xc6) ;\ -+ .long p1(0xe8),p1(0xdd),p1(0x74),p1(0x1f),p1(0x4b),p1(0xbd),p1(0x8b),p1(0x8a) ;\ -+ .long 
p1(0x70),p1(0x3e),p1(0xb5),p1(0x66),p1(0x48),p1(0x03),p1(0xf6),p1(0x0e) ;\ -+ .long p1(0x61),p1(0x35),p1(0x57),p1(0xb9),p1(0x86),p1(0xc1),p1(0x1d),p1(0x9e) -+#define sb_data7(p1) \ -+ .long p1(0xe1),p1(0xf8),p1(0x98),p1(0x11),p1(0x69),p1(0xd9),p1(0x8e),p1(0x94) ;\ -+ .long p1(0x9b),p1(0x1e),p1(0x87),p1(0xe9),p1(0xce),p1(0x55),p1(0x28),p1(0xdf) ;\ -+ .long p1(0x8c),p1(0xa1),p1(0x89),p1(0x0d),p1(0xbf),p1(0xe6),p1(0x42),p1(0x68) ;\ -+ .long p1(0x41),p1(0x99),p1(0x2d),p1(0x0f),p1(0xb0),p1(0x54),p1(0xbb),p1(0x16) -+ -+// Inverse S-box data - 256 entries -+ -+#define ib_data0(p1) \ -+ .long p1(0x52),p1(0x09),p1(0x6a),p1(0xd5),p1(0x30),p1(0x36),p1(0xa5),p1(0x38) ;\ -+ .long p1(0xbf),p1(0x40),p1(0xa3),p1(0x9e),p1(0x81),p1(0xf3),p1(0xd7),p1(0xfb) ;\ -+ .long p1(0x7c),p1(0xe3),p1(0x39),p1(0x82),p1(0x9b),p1(0x2f),p1(0xff),p1(0x87) ;\ -+ .long p1(0x34),p1(0x8e),p1(0x43),p1(0x44),p1(0xc4),p1(0xde),p1(0xe9),p1(0xcb) -+#define ib_data1(p1) \ -+ .long p1(0x54),p1(0x7b),p1(0x94),p1(0x32),p1(0xa6),p1(0xc2),p1(0x23),p1(0x3d) ;\ -+ .long p1(0xee),p1(0x4c),p1(0x95),p1(0x0b),p1(0x42),p1(0xfa),p1(0xc3),p1(0x4e) ;\ -+ .long p1(0x08),p1(0x2e),p1(0xa1),p1(0x66),p1(0x28),p1(0xd9),p1(0x24),p1(0xb2) ;\ -+ .long p1(0x76),p1(0x5b),p1(0xa2),p1(0x49),p1(0x6d),p1(0x8b),p1(0xd1),p1(0x25) -+#define ib_data2(p1) \ -+ .long p1(0x72),p1(0xf8),p1(0xf6),p1(0x64),p1(0x86),p1(0x68),p1(0x98),p1(0x16) ;\ -+ .long p1(0xd4),p1(0xa4),p1(0x5c),p1(0xcc),p1(0x5d),p1(0x65),p1(0xb6),p1(0x92) ;\ -+ .long p1(0x6c),p1(0x70),p1(0x48),p1(0x50),p1(0xfd),p1(0xed),p1(0xb9),p1(0xda) ;\ -+ .long p1(0x5e),p1(0x15),p1(0x46),p1(0x57),p1(0xa7),p1(0x8d),p1(0x9d),p1(0x84) -+#define ib_data3(p1) \ -+ .long p1(0x90),p1(0xd8),p1(0xab),p1(0x00),p1(0x8c),p1(0xbc),p1(0xd3),p1(0x0a) ;\ -+ .long p1(0xf7),p1(0xe4),p1(0x58),p1(0x05),p1(0xb8),p1(0xb3),p1(0x45),p1(0x06) ;\ -+ .long p1(0xd0),p1(0x2c),p1(0x1e),p1(0x8f),p1(0xca),p1(0x3f),p1(0x0f),p1(0x02) ;\ -+ .long p1(0xc1),p1(0xaf),p1(0xbd),p1(0x03),p1(0x01),p1(0x13),p1(0x8a),p1(0x6b) -+#define ib_data4(p1) \ -+ .long p1(0x3a),p1(0x91),p1(0x11),p1(0x41),p1(0x4f),p1(0x67),p1(0xdc),p1(0xea) ;\ -+ .long p1(0x97),p1(0xf2),p1(0xcf),p1(0xce),p1(0xf0),p1(0xb4),p1(0xe6),p1(0x73) ;\ -+ .long p1(0x96),p1(0xac),p1(0x74),p1(0x22),p1(0xe7),p1(0xad),p1(0x35),p1(0x85) ;\ -+ .long p1(0xe2),p1(0xf9),p1(0x37),p1(0xe8),p1(0x1c),p1(0x75),p1(0xdf),p1(0x6e) -+#define ib_data5(p1) \ -+ .long p1(0x47),p1(0xf1),p1(0x1a),p1(0x71),p1(0x1d),p1(0x29),p1(0xc5),p1(0x89) ;\ -+ .long p1(0x6f),p1(0xb7),p1(0x62),p1(0x0e),p1(0xaa),p1(0x18),p1(0xbe),p1(0x1b) ;\ -+ .long p1(0xfc),p1(0x56),p1(0x3e),p1(0x4b),p1(0xc6),p1(0xd2),p1(0x79),p1(0x20) ;\ -+ .long p1(0x9a),p1(0xdb),p1(0xc0),p1(0xfe),p1(0x78),p1(0xcd),p1(0x5a),p1(0xf4) -+#define ib_data6(p1) \ -+ .long p1(0x1f),p1(0xdd),p1(0xa8),p1(0x33),p1(0x88),p1(0x07),p1(0xc7),p1(0x31) ;\ -+ .long p1(0xb1),p1(0x12),p1(0x10),p1(0x59),p1(0x27),p1(0x80),p1(0xec),p1(0x5f) ;\ -+ .long p1(0x60),p1(0x51),p1(0x7f),p1(0xa9),p1(0x19),p1(0xb5),p1(0x4a),p1(0x0d) ;\ -+ .long p1(0x2d),p1(0xe5),p1(0x7a),p1(0x9f),p1(0x93),p1(0xc9),p1(0x9c),p1(0xef) -+#define ib_data7(p1) \ -+ .long p1(0xa0),p1(0xe0),p1(0x3b),p1(0x4d),p1(0xae),p1(0x2a),p1(0xf5),p1(0xb0) ;\ -+ .long p1(0xc8),p1(0xeb),p1(0xbb),p1(0x3c),p1(0x83),p1(0x53),p1(0x99),p1(0x61) ;\ -+ .long p1(0x17),p1(0x2b),p1(0x04),p1(0x7e),p1(0xba),p1(0x77),p1(0xd6),p1(0x26) ;\ -+ .long p1(0xe1),p1(0x69),p1(0x14),p1(0x63),p1(0x55),p1(0x21),p1(0x0c),p1(0x7d) -+ -+// The rcon_table (needed for the key schedule) -+// -+// Here is original Dr Brian Gladman's source code: -+// _rcon_tab: -+// 
%assign x 1 -+// %rep 29 -+// dd x -+// %assign x f2(x) -+// %endrep -+// -+// Here is precomputed output (it's more portable this way): -+ -+ .section .rodata -+ .align ALIGN64BYTES -+aes_rcon_tab: -+ .long 0x01,0x02,0x04,0x08,0x10,0x20,0x40,0x80 -+ .long 0x1b,0x36,0x6c,0xd8,0xab,0x4d,0x9a,0x2f -+ .long 0x5e,0xbc,0x63,0xc6,0x97,0x35,0x6a,0xd4 -+ .long 0xb3,0x7d,0xfa,0xef,0xc5 -+ -+// The forward xor tables -+ -+ .align ALIGN64BYTES -+aes_ft_tab: -+ sb_data0(u0) -+ sb_data1(u0) -+ sb_data2(u0) -+ sb_data3(u0) -+ sb_data4(u0) -+ sb_data5(u0) -+ sb_data6(u0) -+ sb_data7(u0) -+ -+ sb_data0(u1) -+ sb_data1(u1) -+ sb_data2(u1) -+ sb_data3(u1) -+ sb_data4(u1) -+ sb_data5(u1) -+ sb_data6(u1) -+ sb_data7(u1) -+ -+ sb_data0(u2) -+ sb_data1(u2) -+ sb_data2(u2) -+ sb_data3(u2) -+ sb_data4(u2) -+ sb_data5(u2) -+ sb_data6(u2) -+ sb_data7(u2) -+ -+ sb_data0(u3) -+ sb_data1(u3) -+ sb_data2(u3) -+ sb_data3(u3) -+ sb_data4(u3) -+ sb_data5(u3) -+ sb_data6(u3) -+ sb_data7(u3) -+ -+ .align ALIGN64BYTES -+aes_fl_tab: -+ sb_data0(w0) -+ sb_data1(w0) -+ sb_data2(w0) -+ sb_data3(w0) -+ sb_data4(w0) -+ sb_data5(w0) -+ sb_data6(w0) -+ sb_data7(w0) -+ -+ sb_data0(w1) -+ sb_data1(w1) -+ sb_data2(w1) -+ sb_data3(w1) -+ sb_data4(w1) -+ sb_data5(w1) -+ sb_data6(w1) -+ sb_data7(w1) -+ -+ sb_data0(w2) -+ sb_data1(w2) -+ sb_data2(w2) -+ sb_data3(w2) -+ sb_data4(w2) -+ sb_data5(w2) -+ sb_data6(w2) -+ sb_data7(w2) -+ -+ sb_data0(w3) -+ sb_data1(w3) -+ sb_data2(w3) -+ sb_data3(w3) -+ sb_data4(w3) -+ sb_data5(w3) -+ sb_data6(w3) -+ sb_data7(w3) -+ -+// The inverse xor tables -+ -+ .align ALIGN64BYTES -+aes_it_tab: -+ ib_data0(v0) -+ ib_data1(v0) -+ ib_data2(v0) -+ ib_data3(v0) -+ ib_data4(v0) -+ ib_data5(v0) -+ ib_data6(v0) -+ ib_data7(v0) -+ -+ ib_data0(v1) -+ ib_data1(v1) -+ ib_data2(v1) -+ ib_data3(v1) -+ ib_data4(v1) -+ ib_data5(v1) -+ ib_data6(v1) -+ ib_data7(v1) -+ -+ ib_data0(v2) -+ ib_data1(v2) -+ ib_data2(v2) -+ ib_data3(v2) -+ ib_data4(v2) -+ ib_data5(v2) -+ ib_data6(v2) -+ ib_data7(v2) -+ -+ ib_data0(v3) -+ ib_data1(v3) -+ ib_data2(v3) -+ ib_data3(v3) -+ ib_data4(v3) -+ ib_data5(v3) -+ ib_data6(v3) -+ ib_data7(v3) -+ -+ .align ALIGN64BYTES -+aes_il_tab: -+ ib_data0(w0) -+ ib_data1(w0) -+ ib_data2(w0) -+ ib_data3(w0) -+ ib_data4(w0) -+ ib_data5(w0) -+ ib_data6(w0) -+ ib_data7(w0) -+ -+ ib_data0(w1) -+ ib_data1(w1) -+ ib_data2(w1) -+ ib_data3(w1) -+ ib_data4(w1) -+ ib_data5(w1) -+ ib_data6(w1) -+ ib_data7(w1) -+ -+ ib_data0(w2) -+ ib_data1(w2) -+ ib_data2(w2) -+ ib_data3(w2) -+ ib_data4(w2) -+ ib_data5(w2) -+ ib_data6(w2) -+ ib_data7(w2) -+ -+ ib_data0(w3) -+ ib_data1(w3) -+ ib_data2(w3) -+ ib_data3(w3) -+ ib_data4(w3) -+ ib_data5(w3) -+ ib_data6(w3) -+ ib_data7(w3) -+ -+// The inverse mix column tables -+ -+ .align ALIGN64BYTES -+aes_im_tab: -+ im_data0(v0) -+ im_data1(v0) -+ im_data2(v0) -+ im_data3(v0) -+ im_data4(v0) -+ im_data5(v0) -+ im_data6(v0) -+ im_data7(v0) -+ -+ im_data0(v1) -+ im_data1(v1) -+ im_data2(v1) -+ im_data3(v1) -+ im_data4(v1) -+ im_data5(v1) -+ im_data6(v1) -+ im_data7(v1) -+ -+ im_data0(v2) -+ im_data1(v2) -+ im_data2(v2) -+ im_data3(v2) -+ im_data4(v2) -+ im_data5(v2) -+ im_data6(v2) -+ im_data7(v2) -+ -+ im_data0(v3) -+ im_data1(v3) -+ im_data2(v3) -+ im_data3(v3) -+ im_data4(v3) -+ im_data5(v3) -+ im_data6(v3) -+ im_data7(v3) -+ -+#if defined(__ELF__) && defined(SECTION_NOTE_GNU_STACK) -+ .section .note.GNU-stack,"",@progbits -+#endif -diff -urN linux-3.10-noloop/drivers/misc/aes-intel32.S linux-3.10-AES/drivers/misc/aes-intel32.S ---- linux-3.10-noloop/drivers/misc/aes-intel32.S 1970-01-01 
02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/aes-intel32.S 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,655 @@ -+/* -+ * Implement AES algorithm in Intel AES-NI instructions. -+ * -+ * The white paper of AES-NI instructions can be downloaded from: -+ * http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf -+ * -+ * Copyright (C) 2008, Intel Corp. -+ * Author: Huang Ying -+ * Vinodh Gopal -+ * Kahraman Akdemir -+ * -+ * This program is free software; you can redistribute it and/or modify -+ * it under the terms of the GNU General Public License as published by -+ * the Free Software Foundation; either version 2 of the License, or -+ * (at your option) any later version. -+ */ -+ -+/* -+ * Modified by Jari Ruusu, October 2009 -+ * - Adapted for loop-AES -+ */ -+ -+/* -+ * Modified by Jari Ruusu, March 2010 -+ * - Added parallelized 4x512 CBC encrypt -+ */ -+ -+#if !defined(ALIGN64BYTES) -+# define ALIGN64BYTES 64 -+#endif -+ -+ .file "aes-intel32.S" -+ .globl intel_aes_cbc_encrypt -+ .globl intel_aes_cbc_decrypt -+ .globl intel_aes_cbc_enc_4x512 -+ .text -+ -+#define STATE1 %xmm0 -+#define STATE2 %xmm4 -+#define STATE3 %xmm5 -+#define STATE STATE1 -+#define IN1 %xmm1 -+#define IN2 %xmm7 -+#define IN3 %xmm6 -+#define IN IN1 -+#define KEY %xmm2 -+#define IV %xmm3 -+ -+#define KEYP %edi -+#define INP %esi -+#define OUTP %edx -+#define LEN %ecx -+#define IVP %ebx -+#define NRND %eax -+#define TKEYP %ebp -+ -+/* -+ * void intel_aes_cbc_encrypt(const aes_context *, void *src, void *dst, size_t len, void *iv) -+ * -+ * Stack after reg saves: 36(%esp) = void *iv -+ * 32(%esp) = size_t len -+ * 28(%esp) = void *dst -+ * 24(%esp) = void *src -+ * 20(%esp) = aes_context * -+ */ -+ .align ALIGN64BYTES -+intel_aes_cbc_encrypt: -+ push %edi -+ push %esi -+ push %ebx -+ push %ebp -+ mov 20(%esp),KEYP -+ mov 24(%esp),INP -+ mov 28(%esp),OUTP -+ mov 32(%esp),LEN -+ mov 36(%esp),IVP -+ mov 4(KEYP), NRND -+ add $8, KEYP -+ movups (IVP), STATE # load iv as initial state -+.align 4 -+.Lcbc_enc_loop: -+ movups (INP), IN # load input -+ pxor IN, STATE -+ -+ movaps (KEYP), KEY # key -+ mov KEYP, TKEYP -+ pxor KEY, STATE # round 0 -+ add $0x30, TKEYP -+ cmp $12, NRND -+ jb .Lenc128 -+ lea 0x20(TKEYP), TKEYP -+ je .Lenc192 -+ add $0x20, TKEYP -+ movaps -0x60(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps -0x50(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+.align 4 -+.Lenc192: -+ movaps -0x40(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps -0x30(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+.align 4 -+.Lenc128: -+ movaps -0x20(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps -0x10(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps (TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x10(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x20(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x30(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x40(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x50(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x60(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x70(TKEYP), KEY -+ # aesenclast KEY, STATE # 
last round -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xc2 -+ -+ movups STATE, (OUTP) # store output -+ sub $16, LEN -+ add $16, INP -+ add $16, OUTP -+ cmp $16, LEN -+ jge .Lcbc_enc_loop -+ emms -+ pop %ebp -+ pop %ebx -+ pop %esi -+ pop %edi -+ ret -+ -+/* -+ * void intel_aes_cbc_decrypt(const aes_context *, void *src, void *dst, size_t len, void *iv) -+ * -+ * Stack after reg saves: 36(%esp) = void *iv -+ * 32(%esp) = size_t len -+ * 28(%esp) = void *dst -+ * 24(%esp) = void *src -+ * 20(%esp) = aes_context * -+ */ -+ .align ALIGN64BYTES -+intel_aes_cbc_decrypt: -+ push %edi -+ push %esi -+ push %ebx -+ push %ebp -+ mov 20(%esp),KEYP -+ mov 24(%esp),INP -+ mov 28(%esp),OUTP -+ mov 32(%esp),LEN -+ mov 36(%esp),IVP -+ mov 4(KEYP), NRND -+ add $264, KEYP -+ movups (IVP), IV -+ cmp $48, LEN -+ jb .Lcbc_dec_loop1 -+.align 4 -+.Lcbc_dec_loop3: -+ movups (INP), IN1 -+ movaps IN1, STATE1 -+ movups 0x10(INP), IN2 -+ movaps IN2, STATE2 -+ movups 0x20(INP), IN3 -+ movaps IN3, STATE3 -+ -+ movaps (KEYP), KEY # key -+ mov KEYP, TKEYP -+ pxor KEY, STATE1 # round 0 -+ pxor KEY, STATE2 -+ pxor KEY, STATE3 -+ add $0x30, TKEYP -+ cmp $12, NRND -+ jb .L4dec128 -+ lea 0x20(TKEYP), TKEYP -+ je .L4dec192 -+ add $0x20, TKEYP -+ movaps -0x60(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps -0x50(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+.align 4 -+.L4dec192: -+ movaps -0x40(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps -0x30(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+.align 4 -+.L4dec128: -+ movaps -0x20(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps -0x10(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps (TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps 0x10(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps 0x20(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps 0x30(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps 0x40(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # 
aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps 0x50(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps 0x60(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ movaps 0x70(TKEYP), KEY -+ # aesdeclast KEY, STATE1 # last round -+ .byte 0x66, 0x0f, 0x38, 0xdf, 0xc2 -+ # aesdeclast KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xdf, 0xe2 -+ # aesdeclast KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xdf, 0xea -+ -+ pxor IV, STATE1 -+ pxor IN1, STATE2 -+ pxor IN2, STATE3 -+ movaps IN3, IV -+ movups STATE1, (OUTP) -+ movups STATE2, 0x10(OUTP) -+ movups STATE3, 0x20(OUTP) -+ sub $48, LEN -+ add $48, INP -+ add $48, OUTP -+ cmp $48, LEN -+ jge .Lcbc_dec_loop3 -+ cmp $16, LEN -+ jb .Lcbc_dec_ret -+.align 4 -+.Lcbc_dec_loop1: -+ movups (INP), IN -+ movaps IN, STATE -+ -+ movaps (KEYP), KEY # key -+ mov KEYP, TKEYP -+ pxor KEY, STATE # round 0 -+ add $0x30, TKEYP -+ cmp $12, NRND -+ jb .Ldec128 -+ lea 0x20(TKEYP), TKEYP -+ je .Ldec192 -+ add $0x20, TKEYP -+ movaps -0x60(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps -0x50(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+.align 4 -+.Ldec192: -+ movaps -0x40(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps -0x30(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+.align 4 -+.Ldec128: -+ movaps -0x20(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps -0x10(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps (TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x10(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x20(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x30(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x40(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x50(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x60(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x70(TKEYP), KEY -+ # aesdeclast KEY, STATE # last round -+ .byte 0x66, 0x0f, 0x38, 0xdf, 0xc2 -+ -+ pxor IV, STATE -+ movups STATE, (OUTP) -+ movaps IN, IV -+ sub $16, LEN -+ add $16, INP -+ add $16, OUTP -+ cmp $16, LEN -+ jge .Lcbc_dec_loop1 -+.Lcbc_dec_ret: -+ emms -+ pop %ebp -+ pop %ebx -+ pop %esi -+ pop %edi -+ ret -+ -+/* -+ * void intel_aes_cbc_enc_4x512(aes_context **, void *src, void *dst, void *iv) -+ * -+ * Stack after reg saves: 32(%esp) = void *iv -+ * 28(%esp) = void *dst -+ * 24(%esp) = void *src -+ * 20(%esp) = aes_context ** -+ */ -+ .align ALIGN64BYTES -+intel_aes_cbc_enc_4x512: -+ push %edi -+ push %esi -+ push %ebx -+ push %ebp -+ mov 20(%esp),%edi -+ mov 32(%esp),%esi -+ mov 24(%esp),%ebp -+ mov (%edi),%eax # pointer to context struct 1 -+ mov 4(%edi),%ebx # pointer to context struct 2 -+ mov 8(%edi),%ecx # pointer to context struct 3 -+ mov 12(%edi),%edx # pointer to context struct 4 -+ mov 4(%eax),%edi # number of rounds (10/12/14) -+ movups (%esi),%xmm0 # load IV as initial state -+ movups 0x10(%esi),%xmm1 -+ movups 0x20(%esi),%xmm2 -+ 
movups 0x30(%esi),%xmm3 -+ sub $10,%edi -+ mov $0x200,%esi # 512 byte CBC chain -+ shl $4,%edi -+ add $0x38,%edi # 0x38 / 0x58 / 0x78 -+.align 4 -+.Lcbc_enc_loop4: -+ movups (%ebp),%xmm4 # load input -+ movups 0x200(%ebp),%xmm5 -+ movups 0x400(%ebp),%xmm6 -+ movups 0x600(%ebp),%xmm7 -+ add $16,%ebp -+ mov %ebp,24(%esp) -+ mov 28(%esp),%ebp -+ pxor %xmm4,%xmm0 # CBC-mode XOR -+ pxor %xmm5,%xmm1 -+ pxor %xmm6,%xmm2 -+ pxor %xmm7,%xmm3 -+ -+ movaps 0x08(%eax),%xmm4 # round 0 key -+ movaps 0x08(%ebx),%xmm5 -+ movaps 0x08(%ecx),%xmm6 -+ movaps 0x08(%edx),%xmm7 -+ pxor %xmm4,%xmm0 # round 0 XOR -+ pxor %xmm5,%xmm1 -+ pxor %xmm6,%xmm2 -+ pxor %xmm7,%xmm3 -+ -+ cmp $0x58,%edi -+ jb .L4enc128 -+ je .L4enc192 -+ -+ movaps -0x60(%eax,%edi,1),%xmm4 -+ movaps -0x60(%ebx,%edi,1),%xmm5 -+ movaps -0x60(%ecx,%edi,1),%xmm6 -+ movaps -0x60(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps -0x50(%eax,%edi,1),%xmm4 -+ movaps -0x50(%ebx,%edi,1),%xmm5 -+ movaps -0x50(%ecx,%edi,1),%xmm6 -+ movaps -0x50(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+.align 4 -+.L4enc192: -+ movaps -0x40(%eax,%edi,1),%xmm4 -+ movaps -0x40(%ebx,%edi,1),%xmm5 -+ movaps -0x40(%ecx,%edi,1),%xmm6 -+ movaps -0x40(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps -0x30(%eax,%edi,1),%xmm4 -+ movaps -0x30(%ebx,%edi,1),%xmm5 -+ movaps -0x30(%ecx,%edi,1),%xmm6 -+ movaps -0x30(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+.align 4 -+.L4enc128: -+ movaps -0x20(%eax,%edi,1),%xmm4 -+ movaps -0x20(%ebx,%edi,1),%xmm5 -+ movaps -0x20(%ecx,%edi,1),%xmm6 -+ movaps -0x20(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps -0x10(%eax,%edi,1),%xmm4 -+ movaps -0x10(%ebx,%edi,1),%xmm5 -+ movaps -0x10(%ecx,%edi,1),%xmm6 -+ movaps -0x10(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps (%eax,%edi,1),%xmm4 -+ movaps (%ebx,%edi,1),%xmm5 -+ movaps (%ecx,%edi,1),%xmm6 -+ movaps (%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x10(%eax,%edi,1),%xmm4 -+ movaps 0x10(%ebx,%edi,1),%xmm5 -+ movaps 
0x10(%ecx,%edi,1),%xmm6 -+ movaps 0x10(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x20(%eax,%edi,1),%xmm4 -+ movaps 0x20(%ebx,%edi,1),%xmm5 -+ movaps 0x20(%ecx,%edi,1),%xmm6 -+ movaps 0x20(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x30(%eax,%edi,1),%xmm4 -+ movaps 0x30(%ebx,%edi,1),%xmm5 -+ movaps 0x30(%ecx,%edi,1),%xmm6 -+ movaps 0x30(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x40(%eax,%edi,1),%xmm4 -+ movaps 0x40(%ebx,%edi,1),%xmm5 -+ movaps 0x40(%ecx,%edi,1),%xmm6 -+ movaps 0x40(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x50(%eax,%edi,1),%xmm4 -+ movaps 0x50(%ebx,%edi,1),%xmm5 -+ movaps 0x50(%ecx,%edi,1),%xmm6 -+ movaps 0x50(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x60(%eax,%edi,1),%xmm4 -+ movaps 0x60(%ebx,%edi,1),%xmm5 -+ movaps 0x60(%ecx,%edi,1),%xmm6 -+ movaps 0x60(%edx,%edi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x70(%eax,%edi,1),%xmm4 -+ movaps 0x70(%ebx,%edi,1),%xmm5 -+ movaps 0x70(%ecx,%edi,1),%xmm6 -+ movaps 0x70(%edx,%edi,1),%xmm7 -+ # aesenclast %xmm4,%xmm0 # last round -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xc4 -+ # aesenclast %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xcd -+ # aesenclast %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xd6 -+ # aesenclast %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xdf -+ -+ sub $16,%esi -+ movups %xmm0,(%ebp) # store output -+ movups %xmm1,0x200(%ebp) -+ movups %xmm2,0x400(%ebp) -+ movups %xmm3,0x600(%ebp) -+ add $16,%ebp -+ mov %ebp,28(%esp) -+ mov 24(%esp),%ebp -+ cmp $16,%esi -+ jge .Lcbc_enc_loop4 -+ emms -+ pop %ebp -+ pop %ebx -+ pop %esi -+ pop %edi -+ ret -+ -+#if defined(__ELF__) && defined(SECTION_NOTE_GNU_STACK) -+ .section .note.GNU-stack,"",@progbits -+#endif -diff -urN linux-3.10-noloop/drivers/misc/aes-intel64.S linux-3.10-AES/drivers/misc/aes-intel64.S ---- linux-3.10-noloop/drivers/misc/aes-intel64.S 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/aes-intel64.S 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,649 @@ -+/* -+ * Implement AES algorithm in Intel AES-NI instructions. 
-+ * -+ * The white paper of AES-NI instructions can be downloaded from: -+ * http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf -+ * -+ * Copyright (C) 2008, Intel Corp. -+ * Author: Huang Ying -+ * Vinodh Gopal -+ * Kahraman Akdemir -+ * -+ * This program is free software; you can redistribute it and/or modify -+ * it under the terms of the GNU General Public License as published by -+ * the Free Software Foundation; either version 2 of the License, or -+ * (at your option) any later version. -+ */ -+ -+/* -+ * Modified by Jari Ruusu, October 2009 -+ * - Adapted for loop-AES -+ */ -+ -+/* -+ * Modified by Jari Ruusu, March 2010 -+ * - Added parallelized 4x512 CBC encrypt -+ */ -+ -+#if !defined(ALIGN64BYTES) -+# define ALIGN64BYTES 64 -+#endif -+ -+ .file "aes-intel64.S" -+ .globl intel_aes_cbc_encrypt -+ .globl intel_aes_cbc_decrypt -+ .globl intel_aes_cbc_enc_4x512 -+ .text -+ -+#define STATE1 %xmm0 -+#define STATE2 %xmm4 -+#define STATE3 %xmm5 -+#define STATE4 %xmm6 -+#define STATE STATE1 -+#define IN1 %xmm1 -+#define IN2 %xmm7 -+#define IN3 %xmm8 -+#define IN4 %xmm9 -+#define IN IN1 -+#define KEY %xmm2 -+#define IV %xmm3 -+ -+#define KEYP %rdi -+#define INP %rsi -+#define OUTP %rdx -+#define LEN %rcx -+#define IVP %r8 -+#define NRND %r9d -+#define TKEYP %r10 -+ -+/* -+ * void intel_aes_cbc_encrypt(const aes_context *, void *src, void *dst, size_t len, void *iv) -+ * -+ * Parameters: %rdi = aes_context * -+ * %rsi = void *src -+ * %rdx = void *dst -+ * %rcx = size_t len -+ * %r8 = void *iv -+ */ -+ .align ALIGN64BYTES -+intel_aes_cbc_encrypt: -+ mov 4(KEYP), NRND -+ add $8, KEYP -+ movups (IVP), STATE # load iv as initial state -+.align 4 -+.Lcbc_enc_loop: -+ movups (INP), IN # load input -+ pxor IN, STATE -+ -+ movaps (KEYP), KEY # key -+ mov KEYP, TKEYP -+ pxor KEY, STATE # round 0 -+ add $0x30, TKEYP -+ cmp $12, NRND -+ jb .Lenc128 -+ lea 0x20(TKEYP), TKEYP -+ je .Lenc192 -+ add $0x20, TKEYP -+ movaps -0x60(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps -0x50(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+.align 4 -+.Lenc192: -+ movaps -0x40(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps -0x30(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+.align 4 -+.Lenc128: -+ movaps -0x20(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps -0x10(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps (TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x10(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x20(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x30(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x40(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x50(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x60(TKEYP), KEY -+ # aesenc KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc2 -+ movaps 0x70(TKEYP), KEY -+ # aesenclast KEY, STATE # last round -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xc2 -+ -+ movups STATE, (OUTP) # store output -+ sub $16, LEN -+ add $16, INP -+ add $16, OUTP -+ cmp $16, LEN -+ jge .Lcbc_enc_loop -+ emms -+ ret -+ -+/* -+ * void intel_aes_cbc_decrypt(const aes_context *, void *src, void *dst, size_t len, void *iv) -+ * -+ * Parameters: %rdi = aes_context 
* -+ * %rsi = void *src -+ * %rdx = void *dst -+ * %rcx = size_t len -+ * %r8 = void *iv -+ */ -+ .align ALIGN64BYTES -+intel_aes_cbc_decrypt: -+ mov 4(KEYP), NRND -+ add $264, KEYP -+ movups (IVP), IV -+ cmp $64, LEN -+ jb .Lcbc_dec_loop1 -+.align 4 -+.Lcbc_dec_loop4: -+ movups (INP), IN1 -+ movaps IN1, STATE1 -+ movups 0x10(INP), IN2 -+ movaps IN2, STATE2 -+ movups 0x20(INP), IN3 -+ movaps IN3, STATE3 -+ movups 0x30(INP), IN4 -+ movaps IN4, STATE4 -+ -+ movaps (KEYP), KEY # key -+ mov KEYP, TKEYP -+ pxor KEY, STATE1 # round 0 -+ pxor KEY, STATE2 -+ pxor KEY, STATE3 -+ pxor KEY, STATE4 -+ add $0x30, TKEYP -+ cmp $12, NRND -+ jb .L4dec128 -+ lea 0x20(TKEYP), TKEYP -+ je .L4dec192 -+ add $0x20, TKEYP -+ movaps -0x60(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps -0x50(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+.align 4 -+.L4dec192: -+ movaps -0x40(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps -0x30(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+.align 4 -+.L4dec128: -+ movaps -0x20(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps -0x10(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps (TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps 0x10(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps 0x20(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps 0x30(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps 0x40(TKEYP), KEY -+ # aesdec KEY, 
STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps 0x50(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps 0x60(TKEYP), KEY -+ # aesdec KEY, STATE1 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ # aesdec KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xe2 -+ # aesdec KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xea -+ # aesdec KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xf2 -+ movaps 0x70(TKEYP), KEY -+ # aesdeclast KEY, STATE1 # last round -+ .byte 0x66, 0x0f, 0x38, 0xdf, 0xc2 -+ # aesdeclast KEY, STATE2 -+ .byte 0x66, 0x0f, 0x38, 0xdf, 0xe2 -+ # aesdeclast KEY, STATE3 -+ .byte 0x66, 0x0f, 0x38, 0xdf, 0xea -+ # aesdeclast KEY, STATE4 -+ .byte 0x66, 0x0f, 0x38, 0xdf, 0xf2 -+ -+ pxor IV, STATE1 -+ pxor IN1, STATE2 -+ pxor IN2, STATE3 -+ pxor IN3, STATE4 -+ movaps IN4, IV -+ movups STATE1, (OUTP) -+ movups STATE2, 0x10(OUTP) -+ movups STATE3, 0x20(OUTP) -+ movups STATE4, 0x30(OUTP) -+ sub $64, LEN -+ add $64, INP -+ add $64, OUTP -+ cmp $64, LEN -+ jge .Lcbc_dec_loop4 -+ cmp $16, LEN -+ jb .Lcbc_dec_ret -+.align 4 -+.Lcbc_dec_loop1: -+ movups (INP), IN -+ movaps IN, STATE -+ -+ movaps (KEYP), KEY # key -+ mov KEYP, TKEYP -+ pxor KEY, STATE # round 0 -+ add $0x30, TKEYP -+ cmp $12, NRND -+ jb .Ldec128 -+ lea 0x20(TKEYP), TKEYP -+ je .Ldec192 -+ add $0x20, TKEYP -+ movaps -0x60(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps -0x50(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+.align 4 -+.Ldec192: -+ movaps -0x40(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps -0x30(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+.align 4 -+.Ldec128: -+ movaps -0x20(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps -0x10(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps (TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x10(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x20(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x30(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x40(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x50(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x60(TKEYP), KEY -+ # aesdec KEY, STATE -+ .byte 0x66, 0x0f, 0x38, 0xde, 0xc2 -+ movaps 0x70(TKEYP), KEY -+ # aesdeclast KEY, STATE # last round -+ .byte 0x66, 0x0f, 0x38, 0xdf, 0xc2 -+ -+ pxor IV, STATE -+ movups STATE, (OUTP) -+ movaps IN, IV -+ sub $16, LEN -+ add $16, INP -+ add $16, OUTP -+ cmp $16, LEN -+ jge .Lcbc_dec_loop1 -+.Lcbc_dec_ret: -+ emms -+ ret -+ -+/* -+ * void intel_aes_cbc_enc_4x512(aes_context **, void *src, void *dst, void *iv) -+ * -+ * Parameters: %rdi = aes_context ** -+ * %rsi = void *src -+ * %rdx = void *dst -+ * %rcx = void *iv -+ */ -+ .align ALIGN64BYTES -+intel_aes_cbc_enc_4x512: -+ mov (%rdi),%rax # pointer to context struct 1 -+ mov 8(%rdi),%r8 # pointer to context struct 2 -+ mov 16(%rdi),%r9 # pointer to 
context struct 3 -+ mov 24(%rdi),%r10 # pointer to context struct 4 -+ mov 4(%rax),%edi # number of rounds (10/12/14) -+ movups (%rcx),%xmm0 # load IV as initial state -+ movups 0x10(%rcx),%xmm1 -+ movups 0x20(%rcx),%xmm2 -+ movups 0x30(%rcx),%xmm3 -+ sub $10,%edi -+ mov $0x200,%ecx # 512 byte CBC chain -+ shl $4,%edi -+ add $0x38,%edi # 0x38 / 0x58 / 0x78 -+.align 4 -+.Lcbc_enc_loop4: -+ movups (%rsi),%xmm4 # load input -+ movups 0x200(%rsi),%xmm5 -+ movups 0x400(%rsi),%xmm6 -+ movups 0x600(%rsi),%xmm7 -+ add $16,%rsi -+ pxor %xmm4,%xmm0 # CBC-mode XOR -+ pxor %xmm5,%xmm1 -+ pxor %xmm6,%xmm2 -+ pxor %xmm7,%xmm3 -+ -+ movaps 0x08(%rax),%xmm4 # round 0 key -+ movaps 0x08(%r8),%xmm5 -+ movaps 0x08(%r9),%xmm6 -+ movaps 0x08(%r10),%xmm7 -+ pxor %xmm4,%xmm0 # round 0 XOR -+ pxor %xmm5,%xmm1 -+ pxor %xmm6,%xmm2 -+ pxor %xmm7,%xmm3 -+ -+ cmp $0x58,%edi -+ jb .L4enc128 -+ je .L4enc192 -+ -+ movaps -0x60(%rax,%rdi,1),%xmm4 -+ movaps -0x60(%r8,%rdi,1),%xmm5 -+ movaps -0x60(%r9,%rdi,1),%xmm6 -+ movaps -0x60(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps -0x50(%rax,%rdi,1),%xmm4 -+ movaps -0x50(%r8,%rdi,1),%xmm5 -+ movaps -0x50(%r9,%rdi,1),%xmm6 -+ movaps -0x50(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+.align 4 -+.L4enc192: -+ movaps -0x40(%rax,%rdi,1),%xmm4 -+ movaps -0x40(%r8,%rdi,1),%xmm5 -+ movaps -0x40(%r9,%rdi,1),%xmm6 -+ movaps -0x40(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps -0x30(%rax,%rdi,1),%xmm4 -+ movaps -0x30(%r8,%rdi,1),%xmm5 -+ movaps -0x30(%r9,%rdi,1),%xmm6 -+ movaps -0x30(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+.align 4 -+.L4enc128: -+ movaps -0x20(%rax,%rdi,1),%xmm4 -+ movaps -0x20(%r8,%rdi,1),%xmm5 -+ movaps -0x20(%r9,%rdi,1),%xmm6 -+ movaps -0x20(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps -0x10(%rax,%rdi,1),%xmm4 -+ movaps -0x10(%r8,%rdi,1),%xmm5 -+ movaps -0x10(%r9,%rdi,1),%xmm6 -+ movaps -0x10(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps (%rax,%rdi,1),%xmm4 -+ movaps (%r8,%rdi,1),%xmm5 -+ movaps (%r9,%rdi,1),%xmm6 -+ movaps (%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 
0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x10(%rax,%rdi,1),%xmm4 -+ movaps 0x10(%r8,%rdi,1),%xmm5 -+ movaps 0x10(%r9,%rdi,1),%xmm6 -+ movaps 0x10(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x20(%rax,%rdi,1),%xmm4 -+ movaps 0x20(%r8,%rdi,1),%xmm5 -+ movaps 0x20(%r9,%rdi,1),%xmm6 -+ movaps 0x20(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x30(%rax,%rdi,1),%xmm4 -+ movaps 0x30(%r8,%rdi,1),%xmm5 -+ movaps 0x30(%r9,%rdi,1),%xmm6 -+ movaps 0x30(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x40(%rax,%rdi,1),%xmm4 -+ movaps 0x40(%r8,%rdi,1),%xmm5 -+ movaps 0x40(%r9,%rdi,1),%xmm6 -+ movaps 0x40(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x50(%rax,%rdi,1),%xmm4 -+ movaps 0x50(%r8,%rdi,1),%xmm5 -+ movaps 0x50(%r9,%rdi,1),%xmm6 -+ movaps 0x50(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x60(%rax,%rdi,1),%xmm4 -+ movaps 0x60(%r8,%rdi,1),%xmm5 -+ movaps 0x60(%r9,%rdi,1),%xmm6 -+ movaps 0x60(%r10,%rdi,1),%xmm7 -+ # aesenc %xmm4,%xmm0 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xc4 -+ # aesenc %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xcd -+ # aesenc %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xd6 -+ # aesenc %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdc, 0xdf -+ -+ movaps 0x70(%rax,%rdi,1),%xmm4 -+ movaps 0x70(%r8,%rdi,1),%xmm5 -+ movaps 0x70(%r9,%rdi,1),%xmm6 -+ movaps 0x70(%r10,%rdi,1),%xmm7 -+ # aesenclast %xmm4,%xmm0 # last round -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xc4 -+ # aesenclast %xmm5,%xmm1 -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xcd -+ # aesenclast %xmm6,%xmm2 -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xd6 -+ # aesenclast %xmm7,%xmm3 -+ .byte 0x66, 0x0f, 0x38, 0xdd, 0xdf -+ -+ sub $16,%ecx -+ movups %xmm0,(%rdx) # store output -+ movups %xmm1,0x200(%rdx) -+ movups %xmm2,0x400(%rdx) -+ movups %xmm3,0x600(%rdx) -+ add $16,%rdx -+ cmp $16,%ecx -+ jge .Lcbc_enc_loop4 -+ emms -+ ret -+ -+#if defined(__ELF__) && defined(SECTION_NOTE_GNU_STACK) -+ .section .note.GNU-stack,"",@progbits -+#endif -diff -urN linux-3.10-noloop/drivers/misc/aes-x86.S linux-3.10-AES/drivers/misc/aes-x86.S ---- linux-3.10-noloop/drivers/misc/aes-x86.S 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/aes-x86.S 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,927 @@ -+// -+// Copyright (c) 2001, Dr Brian Gladman , Worcester, UK. -+// All rights reserved. 
-+// -+// TERMS -+// -+// Redistribution and use in source and binary forms, with or without -+// modification, are permitted subject to the following conditions: -+// -+// 1. Redistributions of source code must retain the above copyright -+// notice, this list of conditions and the following disclaimer. -+// -+// 2. Redistributions in binary form must reproduce the above copyright -+// notice, this list of conditions and the following disclaimer in the -+// documentation and/or other materials provided with the distribution. -+// -+// 3. The copyright holder's name must not be used to endorse or promote -+// any products derived from this software without his specific prior -+// written permission. -+// -+// This software is provided 'as is' with no express or implied warranties -+// of correctness or fitness for purpose. -+ -+// Modified by Jari Ruusu, December 24 2001 -+// - Converted syntax to GNU CPP/assembler syntax -+// - C programming interface converted back to "old" API -+// - Minor portability cleanups and speed optimizations -+ -+// Modified by Jari Ruusu, April 11 2002 -+// - Added above copyright and terms to resulting object code so that -+// binary distributions can avoid legal trouble -+ -+// An AES (Rijndael) implementation for x86 compatible processors. This -+// version uses i386 instruction set but instruction scheduling is optimized -+// for Pentium-2. This version only implements the standard AES block length -+// (128 bits, 16 bytes). This code does not preserve the eax, ecx or edx -+// registers or the artihmetic status flags. However, the ebx, esi, edi, and -+// ebp registers are preserved across calls. -+ -+// void aes_set_key(aes_context *cx, const unsigned char key[], const int key_len, const int f) -+// void aes_encrypt(const aes_context *cx, const unsigned char in_blk[], unsigned char out_blk[]) -+// void aes_decrypt(const aes_context *cx, const unsigned char in_blk[], unsigned char out_blk[]) -+ -+#if defined(USE_UNDERLINE) -+# define aes_set_key _aes_set_key -+# define aes_encrypt _aes_encrypt -+# define aes_decrypt _aes_decrypt -+#endif -+#if !defined(ALIGN32BYTES) -+# define ALIGN32BYTES 32 -+#endif -+ -+ .file "aes-x86.S" -+ .globl aes_set_key -+ .globl aes_encrypt -+ .globl aes_decrypt -+ -+ .text -+copyright: -+ .ascii " \000" -+ .ascii "Copyright (c) 2001, Dr Brian Gladman , Worcester, UK.\000" -+ .ascii "All rights reserved.\000" -+ .ascii " \000" -+ .ascii "TERMS\000" -+ .ascii " \000" -+ .ascii " Redistribution and use in source and binary forms, with or without\000" -+ .ascii " modification, are permitted subject to the following conditions:\000" -+ .ascii " \000" -+ .ascii " 1. Redistributions of source code must retain the above copyright\000" -+ .ascii " notice, this list of conditions and the following disclaimer.\000" -+ .ascii " \000" -+ .ascii " 2. Redistributions in binary form must reproduce the above copyright\000" -+ .ascii " notice, this list of conditions and the following disclaimer in the\000" -+ .ascii " documentation and/or other materials provided with the distribution.\000" -+ .ascii " \000" -+ .ascii " 3. 
The copyright holder's name must not be used to endorse or promote\000" -+ .ascii " any products derived from this software without his specific prior\000" -+ .ascii " written permission.\000" -+ .ascii " \000" -+ .ascii " This software is provided 'as is' with no express or implied warranties\000" -+ .ascii " of correctness or fitness for purpose.\000" -+ .ascii " \000" -+ -+#define tlen 1024 // length of each of 4 'xor' arrays (256 32-bit words) -+ -+// offsets to parameters with one register pushed onto stack -+ -+#define ctx 8 // AES context structure -+#define in_blk 12 // input byte array address parameter -+#define out_blk 16 // output byte array address parameter -+ -+// offsets in context structure -+ -+#define nkey 0 // key length, size 4 -+#define nrnd 4 // number of rounds, size 4 -+#define ekey 8 // encryption key schedule base address, size 256 -+#define dkey 264 // decryption key schedule base address, size 256 -+ -+// This macro performs a forward encryption cycle. It is entered with -+// the first previous round column values in %eax, %ebx, %esi and %edi and -+// exits with the final values in the same registers. -+ -+#define fwd_rnd(p1,p2) \ -+ mov %ebx,(%esp) ;\ -+ movzbl %al,%edx ;\ -+ mov %eax,%ecx ;\ -+ mov p2(%ebp),%eax ;\ -+ mov %edi,4(%esp) ;\ -+ mov p2+12(%ebp),%edi ;\ -+ xor p1(,%edx,4),%eax ;\ -+ movzbl %ch,%edx ;\ -+ shr $16,%ecx ;\ -+ mov p2+4(%ebp),%ebx ;\ -+ xor p1+tlen(,%edx,4),%edi ;\ -+ movzbl %cl,%edx ;\ -+ movzbl %ch,%ecx ;\ -+ xor p1+3*tlen(,%ecx,4),%ebx ;\ -+ mov %esi,%ecx ;\ -+ mov p1+2*tlen(,%edx,4),%esi ;\ -+ movzbl %cl,%edx ;\ -+ xor p1(,%edx,4),%esi ;\ -+ movzbl %ch,%edx ;\ -+ shr $16,%ecx ;\ -+ xor p1+tlen(,%edx,4),%ebx ;\ -+ movzbl %cl,%edx ;\ -+ movzbl %ch,%ecx ;\ -+ xor p1+2*tlen(,%edx,4),%eax ;\ -+ mov (%esp),%edx ;\ -+ xor p1+3*tlen(,%ecx,4),%edi ;\ -+ movzbl %dl,%ecx ;\ -+ xor p2+8(%ebp),%esi ;\ -+ xor p1(,%ecx,4),%ebx ;\ -+ movzbl %dh,%ecx ;\ -+ shr $16,%edx ;\ -+ xor p1+tlen(,%ecx,4),%eax ;\ -+ movzbl %dl,%ecx ;\ -+ movzbl %dh,%edx ;\ -+ xor p1+2*tlen(,%ecx,4),%edi ;\ -+ mov 4(%esp),%ecx ;\ -+ xor p1+3*tlen(,%edx,4),%esi ;\ -+ movzbl %cl,%edx ;\ -+ xor p1(,%edx,4),%edi ;\ -+ movzbl %ch,%edx ;\ -+ shr $16,%ecx ;\ -+ xor p1+tlen(,%edx,4),%esi ;\ -+ movzbl %cl,%edx ;\ -+ movzbl %ch,%ecx ;\ -+ xor p1+2*tlen(,%edx,4),%ebx ;\ -+ xor p1+3*tlen(,%ecx,4),%eax -+ -+// This macro performs an inverse encryption cycle. It is entered with -+// the first previous round column values in %eax, %ebx, %esi and %edi and -+// exits with the final values in the same registers. 
-+ -+#define inv_rnd(p1,p2) \ -+ movzbl %al,%edx ;\ -+ mov %ebx,(%esp) ;\ -+ mov %eax,%ecx ;\ -+ mov p2(%ebp),%eax ;\ -+ mov %edi,4(%esp) ;\ -+ mov p2+4(%ebp),%ebx ;\ -+ xor p1(,%edx,4),%eax ;\ -+ movzbl %ch,%edx ;\ -+ shr $16,%ecx ;\ -+ mov p2+12(%ebp),%edi ;\ -+ xor p1+tlen(,%edx,4),%ebx ;\ -+ movzbl %cl,%edx ;\ -+ movzbl %ch,%ecx ;\ -+ xor p1+3*tlen(,%ecx,4),%edi ;\ -+ mov %esi,%ecx ;\ -+ mov p1+2*tlen(,%edx,4),%esi ;\ -+ movzbl %cl,%edx ;\ -+ xor p1(,%edx,4),%esi ;\ -+ movzbl %ch,%edx ;\ -+ shr $16,%ecx ;\ -+ xor p1+tlen(,%edx,4),%edi ;\ -+ movzbl %cl,%edx ;\ -+ movzbl %ch,%ecx ;\ -+ xor p1+2*tlen(,%edx,4),%eax ;\ -+ mov (%esp),%edx ;\ -+ xor p1+3*tlen(,%ecx,4),%ebx ;\ -+ movzbl %dl,%ecx ;\ -+ xor p2+8(%ebp),%esi ;\ -+ xor p1(,%ecx,4),%ebx ;\ -+ movzbl %dh,%ecx ;\ -+ shr $16,%edx ;\ -+ xor p1+tlen(,%ecx,4),%esi ;\ -+ movzbl %dl,%ecx ;\ -+ movzbl %dh,%edx ;\ -+ xor p1+2*tlen(,%ecx,4),%edi ;\ -+ mov 4(%esp),%ecx ;\ -+ xor p1+3*tlen(,%edx,4),%eax ;\ -+ movzbl %cl,%edx ;\ -+ xor p1(,%edx,4),%edi ;\ -+ movzbl %ch,%edx ;\ -+ shr $16,%ecx ;\ -+ xor p1+tlen(,%edx,4),%eax ;\ -+ movzbl %cl,%edx ;\ -+ movzbl %ch,%ecx ;\ -+ xor p1+2*tlen(,%edx,4),%ebx ;\ -+ xor p1+3*tlen(,%ecx,4),%esi -+ -+// AES (Rijndael) Encryption Subroutine -+ -+ .text -+ .align ALIGN32BYTES -+aes_encrypt: -+ push %ebp -+ mov ctx(%esp),%ebp // pointer to context -+ mov in_blk(%esp),%ecx -+ push %ebx -+ push %esi -+ push %edi -+ mov nrnd(%ebp),%edx // number of rounds -+ lea ekey+16(%ebp),%ebp // key pointer -+ -+// input four columns and xor in first round key -+ -+ mov (%ecx),%eax -+ mov 4(%ecx),%ebx -+ mov 8(%ecx),%esi -+ mov 12(%ecx),%edi -+ xor -16(%ebp),%eax -+ xor -12(%ebp),%ebx -+ xor -8(%ebp),%esi -+ xor -4(%ebp),%edi -+ -+ sub $8,%esp // space for register saves on stack -+ -+ sub $10,%edx -+ je aes_15 -+ add $32,%ebp -+ sub $2,%edx -+ je aes_13 -+ add $32,%ebp -+ -+ fwd_rnd(aes_ft_tab,-64) // 14 rounds for 256-bit key -+ fwd_rnd(aes_ft_tab,-48) -+aes_13: fwd_rnd(aes_ft_tab,-32) // 12 rounds for 192-bit key -+ fwd_rnd(aes_ft_tab,-16) -+aes_15: fwd_rnd(aes_ft_tab,0) // 10 rounds for 128-bit key -+ fwd_rnd(aes_ft_tab,16) -+ fwd_rnd(aes_ft_tab,32) -+ fwd_rnd(aes_ft_tab,48) -+ fwd_rnd(aes_ft_tab,64) -+ fwd_rnd(aes_ft_tab,80) -+ fwd_rnd(aes_ft_tab,96) -+ fwd_rnd(aes_ft_tab,112) -+ fwd_rnd(aes_ft_tab,128) -+ fwd_rnd(aes_fl_tab,144) // last round uses a different table -+ -+// move final values to the output array. 
-+ -+ mov out_blk+20(%esp),%ebp -+ add $8,%esp -+ mov %eax,(%ebp) -+ mov %ebx,4(%ebp) -+ mov %esi,8(%ebp) -+ mov %edi,12(%ebp) -+ pop %edi -+ pop %esi -+ pop %ebx -+ pop %ebp -+ ret -+ -+ -+// AES (Rijndael) Decryption Subroutine -+ -+ .align ALIGN32BYTES -+aes_decrypt: -+ push %ebp -+ mov ctx(%esp),%ebp // pointer to context -+ mov in_blk(%esp),%ecx -+ push %ebx -+ push %esi -+ push %edi -+ mov nrnd(%ebp),%edx // number of rounds -+ lea dkey+16(%ebp),%ebp // key pointer -+ -+// input four columns and xor in first round key -+ -+ mov (%ecx),%eax -+ mov 4(%ecx),%ebx -+ mov 8(%ecx),%esi -+ mov 12(%ecx),%edi -+ xor -16(%ebp),%eax -+ xor -12(%ebp),%ebx -+ xor -8(%ebp),%esi -+ xor -4(%ebp),%edi -+ -+ sub $8,%esp // space for register saves on stack -+ -+ sub $10,%edx -+ je aes_25 -+ add $32,%ebp -+ sub $2,%edx -+ je aes_23 -+ add $32,%ebp -+ -+ inv_rnd(aes_it_tab,-64) // 14 rounds for 256-bit key -+ inv_rnd(aes_it_tab,-48) -+aes_23: inv_rnd(aes_it_tab,-32) // 12 rounds for 192-bit key -+ inv_rnd(aes_it_tab,-16) -+aes_25: inv_rnd(aes_it_tab,0) // 10 rounds for 128-bit key -+ inv_rnd(aes_it_tab,16) -+ inv_rnd(aes_it_tab,32) -+ inv_rnd(aes_it_tab,48) -+ inv_rnd(aes_it_tab,64) -+ inv_rnd(aes_it_tab,80) -+ inv_rnd(aes_it_tab,96) -+ inv_rnd(aes_it_tab,112) -+ inv_rnd(aes_it_tab,128) -+ inv_rnd(aes_il_tab,144) // last round uses a different table -+ -+// move final values to the output array. -+ -+ mov out_blk+20(%esp),%ebp -+ add $8,%esp -+ mov %eax,(%ebp) -+ mov %ebx,4(%ebp) -+ mov %esi,8(%ebp) -+ mov %edi,12(%ebp) -+ pop %edi -+ pop %esi -+ pop %ebx -+ pop %ebp -+ ret -+ -+// AES (Rijndael) Key Schedule Subroutine -+ -+// input/output parameters -+ -+#define aes_cx 12 // AES context -+#define in_key 16 // key input array address -+#define key_ln 20 // key length, bytes (16,24,32) or bits (128,192,256) -+#define ed_flg 24 // 0=create both encr/decr keys, 1=create encr key only -+ -+// offsets for locals -+ -+#define cnt -4 -+#define slen 8 -+ -+// This macro performs a column mixing operation on an input 32-bit -+// word to give a 32-bit result. It uses each of the 4 bytes in the -+// the input column to index 4 different tables of 256 32-bit words -+// that are xored together to form the output value. 
-+ -+#define mix_col(p1) \ -+ movzbl %bl,%ecx ;\ -+ mov p1(,%ecx,4),%eax ;\ -+ movzbl %bh,%ecx ;\ -+ ror $16,%ebx ;\ -+ xor p1+tlen(,%ecx,4),%eax ;\ -+ movzbl %bl,%ecx ;\ -+ xor p1+2*tlen(,%ecx,4),%eax ;\ -+ movzbl %bh,%ecx ;\ -+ xor p1+3*tlen(,%ecx,4),%eax -+ -+// Key Schedule Macros -+ -+#define ksc4(p1) \ -+ rol $24,%ebx ;\ -+ mix_col(aes_fl_tab) ;\ -+ ror $8,%ebx ;\ -+ xor 4*p1+aes_rcon_tab,%eax ;\ -+ xor %eax,%esi ;\ -+ xor %esi,%ebp ;\ -+ mov %esi,16*p1(%edi) ;\ -+ mov %ebp,16*p1+4(%edi) ;\ -+ xor %ebp,%edx ;\ -+ xor %edx,%ebx ;\ -+ mov %edx,16*p1+8(%edi) ;\ -+ mov %ebx,16*p1+12(%edi) -+ -+#define ksc6(p1) \ -+ rol $24,%ebx ;\ -+ mix_col(aes_fl_tab) ;\ -+ ror $8,%ebx ;\ -+ xor 4*p1+aes_rcon_tab,%eax ;\ -+ xor 24*p1-24(%edi),%eax ;\ -+ mov %eax,24*p1(%edi) ;\ -+ xor 24*p1-20(%edi),%eax ;\ -+ mov %eax,24*p1+4(%edi) ;\ -+ xor %eax,%esi ;\ -+ xor %esi,%ebp ;\ -+ mov %esi,24*p1+8(%edi) ;\ -+ mov %ebp,24*p1+12(%edi) ;\ -+ xor %ebp,%edx ;\ -+ xor %edx,%ebx ;\ -+ mov %edx,24*p1+16(%edi) ;\ -+ mov %ebx,24*p1+20(%edi) -+ -+#define ksc8(p1) \ -+ rol $24,%ebx ;\ -+ mix_col(aes_fl_tab) ;\ -+ ror $8,%ebx ;\ -+ xor 4*p1+aes_rcon_tab,%eax ;\ -+ xor 32*p1-32(%edi),%eax ;\ -+ mov %eax,32*p1(%edi) ;\ -+ xor 32*p1-28(%edi),%eax ;\ -+ mov %eax,32*p1+4(%edi) ;\ -+ xor 32*p1-24(%edi),%eax ;\ -+ mov %eax,32*p1+8(%edi) ;\ -+ xor 32*p1-20(%edi),%eax ;\ -+ mov %eax,32*p1+12(%edi) ;\ -+ push %ebx ;\ -+ mov %eax,%ebx ;\ -+ mix_col(aes_fl_tab) ;\ -+ pop %ebx ;\ -+ xor %eax,%esi ;\ -+ xor %esi,%ebp ;\ -+ mov %esi,32*p1+16(%edi) ;\ -+ mov %ebp,32*p1+20(%edi) ;\ -+ xor %ebp,%edx ;\ -+ xor %edx,%ebx ;\ -+ mov %edx,32*p1+24(%edi) ;\ -+ mov %ebx,32*p1+28(%edi) -+ -+ .align ALIGN32BYTES -+aes_set_key: -+ pushfl -+ push %ebp -+ mov %esp,%ebp -+ sub $slen,%esp -+ push %ebx -+ push %esi -+ push %edi -+ -+ mov aes_cx(%ebp),%edx // edx -> AES context -+ -+ mov key_ln(%ebp),%ecx // key length -+ cmpl $128,%ecx -+ jb aes_30 -+ shr $3,%ecx -+aes_30: cmpl $32,%ecx -+ je aes_32 -+ cmpl $24,%ecx -+ je aes_32 -+ mov $16,%ecx -+aes_32: shr $2,%ecx -+ mov %ecx,nkey(%edx) -+ -+ lea 6(%ecx),%eax // 10/12/14 for 4/6/8 32-bit key length -+ mov %eax,nrnd(%edx) -+ -+ mov in_key(%ebp),%esi // key input array -+ lea ekey(%edx),%edi // key position in AES context -+ cld -+ push %ebp -+ mov %ecx,%eax // save key length in eax -+ rep ; movsl // words in the key schedule -+ mov -4(%esi),%ebx // put some values in registers -+ mov -8(%esi),%edx // to allow faster code -+ mov -12(%esi),%ebp -+ mov -16(%esi),%esi -+ -+ cmpl $4,%eax // jump on key size -+ je aes_36 -+ cmpl $6,%eax -+ je aes_35 -+ -+ ksc8(0) -+ ksc8(1) -+ ksc8(2) -+ ksc8(3) -+ ksc8(4) -+ ksc8(5) -+ ksc8(6) -+ jmp aes_37 -+aes_35: ksc6(0) -+ ksc6(1) -+ ksc6(2) -+ ksc6(3) -+ ksc6(4) -+ ksc6(5) -+ ksc6(6) -+ ksc6(7) -+ jmp aes_37 -+aes_36: ksc4(0) -+ ksc4(1) -+ ksc4(2) -+ ksc4(3) -+ ksc4(4) -+ ksc4(5) -+ ksc4(6) -+ ksc4(7) -+ ksc4(8) -+ ksc4(9) -+aes_37: pop %ebp -+ mov aes_cx(%ebp),%edx // edx -> AES context -+ cmpl $0,ed_flg(%ebp) -+ jne aes_39 -+ -+// compile decryption key schedule from encryption schedule - reverse -+// order and do mix_column operation on round keys except first and last -+ -+ mov nrnd(%edx),%eax // kt = cx->d_key + nc * cx->Nrnd -+ shl $2,%eax -+ lea dkey(%edx,%eax,4),%edi -+ lea ekey(%edx),%esi // kf = cx->e_key -+ -+ movsl // copy first round key (unmodified) -+ movsl -+ movsl -+ movsl -+ sub $32,%edi -+ movl $1,cnt(%ebp) -+aes_38: // do mix column on each column of -+ lodsl // each round key -+ mov %eax,%ebx -+ mix_col(aes_im_tab) -+ stosl -+ lodsl -+ mov 
%eax,%ebx -+ mix_col(aes_im_tab) -+ stosl -+ lodsl -+ mov %eax,%ebx -+ mix_col(aes_im_tab) -+ stosl -+ lodsl -+ mov %eax,%ebx -+ mix_col(aes_im_tab) -+ stosl -+ sub $32,%edi -+ -+ incl cnt(%ebp) -+ mov cnt(%ebp),%eax -+ cmp nrnd(%edx),%eax -+ jb aes_38 -+ -+ movsl // copy last round key (unmodified) -+ movsl -+ movsl -+ movsl -+aes_39: pop %edi -+ pop %esi -+ pop %ebx -+ mov %ebp,%esp -+ pop %ebp -+ popfl -+ ret -+ -+ -+// finite field multiplies by {02}, {04} and {08} -+ -+#define f2(x) ((x<<1)^(((x>>7)&1)*0x11b)) -+#define f4(x) ((x<<2)^(((x>>6)&1)*0x11b)^(((x>>6)&2)*0x11b)) -+#define f8(x) ((x<<3)^(((x>>5)&1)*0x11b)^(((x>>5)&2)*0x11b)^(((x>>5)&4)*0x11b)) -+ -+// finite field multiplies required in table generation -+ -+#define f3(x) (f2(x) ^ x) -+#define f9(x) (f8(x) ^ x) -+#define fb(x) (f8(x) ^ f2(x) ^ x) -+#define fd(x) (f8(x) ^ f4(x) ^ x) -+#define fe(x) (f8(x) ^ f4(x) ^ f2(x)) -+ -+// These defines generate the forward table entries -+ -+#define u0(x) ((f3(x) << 24) | (x << 16) | (x << 8) | f2(x)) -+#define u1(x) ((x << 24) | (x << 16) | (f2(x) << 8) | f3(x)) -+#define u2(x) ((x << 24) | (f2(x) << 16) | (f3(x) << 8) | x) -+#define u3(x) ((f2(x) << 24) | (f3(x) << 16) | (x << 8) | x) -+ -+// These defines generate the inverse table entries -+ -+#define v0(x) ((fb(x) << 24) | (fd(x) << 16) | (f9(x) << 8) | fe(x)) -+#define v1(x) ((fd(x) << 24) | (f9(x) << 16) | (fe(x) << 8) | fb(x)) -+#define v2(x) ((f9(x) << 24) | (fe(x) << 16) | (fb(x) << 8) | fd(x)) -+#define v3(x) ((fe(x) << 24) | (fb(x) << 16) | (fd(x) << 8) | f9(x)) -+ -+// These defines generate entries for the last round tables -+ -+#define w0(x) (x) -+#define w1(x) (x << 8) -+#define w2(x) (x << 16) -+#define w3(x) (x << 24) -+ -+// macro to generate inverse mix column tables (needed for the key schedule) -+ -+#define im_data0(p1) \ -+ .long p1(0x00),p1(0x01),p1(0x02),p1(0x03),p1(0x04),p1(0x05),p1(0x06),p1(0x07) ;\ -+ .long p1(0x08),p1(0x09),p1(0x0a),p1(0x0b),p1(0x0c),p1(0x0d),p1(0x0e),p1(0x0f) ;\ -+ .long p1(0x10),p1(0x11),p1(0x12),p1(0x13),p1(0x14),p1(0x15),p1(0x16),p1(0x17) ;\ -+ .long p1(0x18),p1(0x19),p1(0x1a),p1(0x1b),p1(0x1c),p1(0x1d),p1(0x1e),p1(0x1f) -+#define im_data1(p1) \ -+ .long p1(0x20),p1(0x21),p1(0x22),p1(0x23),p1(0x24),p1(0x25),p1(0x26),p1(0x27) ;\ -+ .long p1(0x28),p1(0x29),p1(0x2a),p1(0x2b),p1(0x2c),p1(0x2d),p1(0x2e),p1(0x2f) ;\ -+ .long p1(0x30),p1(0x31),p1(0x32),p1(0x33),p1(0x34),p1(0x35),p1(0x36),p1(0x37) ;\ -+ .long p1(0x38),p1(0x39),p1(0x3a),p1(0x3b),p1(0x3c),p1(0x3d),p1(0x3e),p1(0x3f) -+#define im_data2(p1) \ -+ .long p1(0x40),p1(0x41),p1(0x42),p1(0x43),p1(0x44),p1(0x45),p1(0x46),p1(0x47) ;\ -+ .long p1(0x48),p1(0x49),p1(0x4a),p1(0x4b),p1(0x4c),p1(0x4d),p1(0x4e),p1(0x4f) ;\ -+ .long p1(0x50),p1(0x51),p1(0x52),p1(0x53),p1(0x54),p1(0x55),p1(0x56),p1(0x57) ;\ -+ .long p1(0x58),p1(0x59),p1(0x5a),p1(0x5b),p1(0x5c),p1(0x5d),p1(0x5e),p1(0x5f) -+#define im_data3(p1) \ -+ .long p1(0x60),p1(0x61),p1(0x62),p1(0x63),p1(0x64),p1(0x65),p1(0x66),p1(0x67) ;\ -+ .long p1(0x68),p1(0x69),p1(0x6a),p1(0x6b),p1(0x6c),p1(0x6d),p1(0x6e),p1(0x6f) ;\ -+ .long p1(0x70),p1(0x71),p1(0x72),p1(0x73),p1(0x74),p1(0x75),p1(0x76),p1(0x77) ;\ -+ .long p1(0x78),p1(0x79),p1(0x7a),p1(0x7b),p1(0x7c),p1(0x7d),p1(0x7e),p1(0x7f) -+#define im_data4(p1) \ -+ .long p1(0x80),p1(0x81),p1(0x82),p1(0x83),p1(0x84),p1(0x85),p1(0x86),p1(0x87) ;\ -+ .long p1(0x88),p1(0x89),p1(0x8a),p1(0x8b),p1(0x8c),p1(0x8d),p1(0x8e),p1(0x8f) ;\ -+ .long p1(0x90),p1(0x91),p1(0x92),p1(0x93),p1(0x94),p1(0x95),p1(0x96),p1(0x97) ;\ -+ .long 
p1(0x98),p1(0x99),p1(0x9a),p1(0x9b),p1(0x9c),p1(0x9d),p1(0x9e),p1(0x9f) -+#define im_data5(p1) \ -+ .long p1(0xa0),p1(0xa1),p1(0xa2),p1(0xa3),p1(0xa4),p1(0xa5),p1(0xa6),p1(0xa7) ;\ -+ .long p1(0xa8),p1(0xa9),p1(0xaa),p1(0xab),p1(0xac),p1(0xad),p1(0xae),p1(0xaf) ;\ -+ .long p1(0xb0),p1(0xb1),p1(0xb2),p1(0xb3),p1(0xb4),p1(0xb5),p1(0xb6),p1(0xb7) ;\ -+ .long p1(0xb8),p1(0xb9),p1(0xba),p1(0xbb),p1(0xbc),p1(0xbd),p1(0xbe),p1(0xbf) -+#define im_data6(p1) \ -+ .long p1(0xc0),p1(0xc1),p1(0xc2),p1(0xc3),p1(0xc4),p1(0xc5),p1(0xc6),p1(0xc7) ;\ -+ .long p1(0xc8),p1(0xc9),p1(0xca),p1(0xcb),p1(0xcc),p1(0xcd),p1(0xce),p1(0xcf) ;\ -+ .long p1(0xd0),p1(0xd1),p1(0xd2),p1(0xd3),p1(0xd4),p1(0xd5),p1(0xd6),p1(0xd7) ;\ -+ .long p1(0xd8),p1(0xd9),p1(0xda),p1(0xdb),p1(0xdc),p1(0xdd),p1(0xde),p1(0xdf) -+#define im_data7(p1) \ -+ .long p1(0xe0),p1(0xe1),p1(0xe2),p1(0xe3),p1(0xe4),p1(0xe5),p1(0xe6),p1(0xe7) ;\ -+ .long p1(0xe8),p1(0xe9),p1(0xea),p1(0xeb),p1(0xec),p1(0xed),p1(0xee),p1(0xef) ;\ -+ .long p1(0xf0),p1(0xf1),p1(0xf2),p1(0xf3),p1(0xf4),p1(0xf5),p1(0xf6),p1(0xf7) ;\ -+ .long p1(0xf8),p1(0xf9),p1(0xfa),p1(0xfb),p1(0xfc),p1(0xfd),p1(0xfe),p1(0xff) -+ -+// S-box data - 256 entries -+ -+#define sb_data0(p1) \ -+ .long p1(0x63),p1(0x7c),p1(0x77),p1(0x7b),p1(0xf2),p1(0x6b),p1(0x6f),p1(0xc5) ;\ -+ .long p1(0x30),p1(0x01),p1(0x67),p1(0x2b),p1(0xfe),p1(0xd7),p1(0xab),p1(0x76) ;\ -+ .long p1(0xca),p1(0x82),p1(0xc9),p1(0x7d),p1(0xfa),p1(0x59),p1(0x47),p1(0xf0) ;\ -+ .long p1(0xad),p1(0xd4),p1(0xa2),p1(0xaf),p1(0x9c),p1(0xa4),p1(0x72),p1(0xc0) -+#define sb_data1(p1) \ -+ .long p1(0xb7),p1(0xfd),p1(0x93),p1(0x26),p1(0x36),p1(0x3f),p1(0xf7),p1(0xcc) ;\ -+ .long p1(0x34),p1(0xa5),p1(0xe5),p1(0xf1),p1(0x71),p1(0xd8),p1(0x31),p1(0x15) ;\ -+ .long p1(0x04),p1(0xc7),p1(0x23),p1(0xc3),p1(0x18),p1(0x96),p1(0x05),p1(0x9a) ;\ -+ .long p1(0x07),p1(0x12),p1(0x80),p1(0xe2),p1(0xeb),p1(0x27),p1(0xb2),p1(0x75) -+#define sb_data2(p1) \ -+ .long p1(0x09),p1(0x83),p1(0x2c),p1(0x1a),p1(0x1b),p1(0x6e),p1(0x5a),p1(0xa0) ;\ -+ .long p1(0x52),p1(0x3b),p1(0xd6),p1(0xb3),p1(0x29),p1(0xe3),p1(0x2f),p1(0x84) ;\ -+ .long p1(0x53),p1(0xd1),p1(0x00),p1(0xed),p1(0x20),p1(0xfc),p1(0xb1),p1(0x5b) ;\ -+ .long p1(0x6a),p1(0xcb),p1(0xbe),p1(0x39),p1(0x4a),p1(0x4c),p1(0x58),p1(0xcf) -+#define sb_data3(p1) \ -+ .long p1(0xd0),p1(0xef),p1(0xaa),p1(0xfb),p1(0x43),p1(0x4d),p1(0x33),p1(0x85) ;\ -+ .long p1(0x45),p1(0xf9),p1(0x02),p1(0x7f),p1(0x50),p1(0x3c),p1(0x9f),p1(0xa8) ;\ -+ .long p1(0x51),p1(0xa3),p1(0x40),p1(0x8f),p1(0x92),p1(0x9d),p1(0x38),p1(0xf5) ;\ -+ .long p1(0xbc),p1(0xb6),p1(0xda),p1(0x21),p1(0x10),p1(0xff),p1(0xf3),p1(0xd2) -+#define sb_data4(p1) \ -+ .long p1(0xcd),p1(0x0c),p1(0x13),p1(0xec),p1(0x5f),p1(0x97),p1(0x44),p1(0x17) ;\ -+ .long p1(0xc4),p1(0xa7),p1(0x7e),p1(0x3d),p1(0x64),p1(0x5d),p1(0x19),p1(0x73) ;\ -+ .long p1(0x60),p1(0x81),p1(0x4f),p1(0xdc),p1(0x22),p1(0x2a),p1(0x90),p1(0x88) ;\ -+ .long p1(0x46),p1(0xee),p1(0xb8),p1(0x14),p1(0xde),p1(0x5e),p1(0x0b),p1(0xdb) -+#define sb_data5(p1) \ -+ .long p1(0xe0),p1(0x32),p1(0x3a),p1(0x0a),p1(0x49),p1(0x06),p1(0x24),p1(0x5c) ;\ -+ .long p1(0xc2),p1(0xd3),p1(0xac),p1(0x62),p1(0x91),p1(0x95),p1(0xe4),p1(0x79) ;\ -+ .long p1(0xe7),p1(0xc8),p1(0x37),p1(0x6d),p1(0x8d),p1(0xd5),p1(0x4e),p1(0xa9) ;\ -+ .long p1(0x6c),p1(0x56),p1(0xf4),p1(0xea),p1(0x65),p1(0x7a),p1(0xae),p1(0x08) -+#define sb_data6(p1) \ -+ .long p1(0xba),p1(0x78),p1(0x25),p1(0x2e),p1(0x1c),p1(0xa6),p1(0xb4),p1(0xc6) ;\ -+ .long p1(0xe8),p1(0xdd),p1(0x74),p1(0x1f),p1(0x4b),p1(0xbd),p1(0x8b),p1(0x8a) ;\ -+ .long 
p1(0x70),p1(0x3e),p1(0xb5),p1(0x66),p1(0x48),p1(0x03),p1(0xf6),p1(0x0e) ;\ -+ .long p1(0x61),p1(0x35),p1(0x57),p1(0xb9),p1(0x86),p1(0xc1),p1(0x1d),p1(0x9e) -+#define sb_data7(p1) \ -+ .long p1(0xe1),p1(0xf8),p1(0x98),p1(0x11),p1(0x69),p1(0xd9),p1(0x8e),p1(0x94) ;\ -+ .long p1(0x9b),p1(0x1e),p1(0x87),p1(0xe9),p1(0xce),p1(0x55),p1(0x28),p1(0xdf) ;\ -+ .long p1(0x8c),p1(0xa1),p1(0x89),p1(0x0d),p1(0xbf),p1(0xe6),p1(0x42),p1(0x68) ;\ -+ .long p1(0x41),p1(0x99),p1(0x2d),p1(0x0f),p1(0xb0),p1(0x54),p1(0xbb),p1(0x16) -+ -+// Inverse S-box data - 256 entries -+ -+#define ib_data0(p1) \ -+ .long p1(0x52),p1(0x09),p1(0x6a),p1(0xd5),p1(0x30),p1(0x36),p1(0xa5),p1(0x38) ;\ -+ .long p1(0xbf),p1(0x40),p1(0xa3),p1(0x9e),p1(0x81),p1(0xf3),p1(0xd7),p1(0xfb) ;\ -+ .long p1(0x7c),p1(0xe3),p1(0x39),p1(0x82),p1(0x9b),p1(0x2f),p1(0xff),p1(0x87) ;\ -+ .long p1(0x34),p1(0x8e),p1(0x43),p1(0x44),p1(0xc4),p1(0xde),p1(0xe9),p1(0xcb) -+#define ib_data1(p1) \ -+ .long p1(0x54),p1(0x7b),p1(0x94),p1(0x32),p1(0xa6),p1(0xc2),p1(0x23),p1(0x3d) ;\ -+ .long p1(0xee),p1(0x4c),p1(0x95),p1(0x0b),p1(0x42),p1(0xfa),p1(0xc3),p1(0x4e) ;\ -+ .long p1(0x08),p1(0x2e),p1(0xa1),p1(0x66),p1(0x28),p1(0xd9),p1(0x24),p1(0xb2) ;\ -+ .long p1(0x76),p1(0x5b),p1(0xa2),p1(0x49),p1(0x6d),p1(0x8b),p1(0xd1),p1(0x25) -+#define ib_data2(p1) \ -+ .long p1(0x72),p1(0xf8),p1(0xf6),p1(0x64),p1(0x86),p1(0x68),p1(0x98),p1(0x16) ;\ -+ .long p1(0xd4),p1(0xa4),p1(0x5c),p1(0xcc),p1(0x5d),p1(0x65),p1(0xb6),p1(0x92) ;\ -+ .long p1(0x6c),p1(0x70),p1(0x48),p1(0x50),p1(0xfd),p1(0xed),p1(0xb9),p1(0xda) ;\ -+ .long p1(0x5e),p1(0x15),p1(0x46),p1(0x57),p1(0xa7),p1(0x8d),p1(0x9d),p1(0x84) -+#define ib_data3(p1) \ -+ .long p1(0x90),p1(0xd8),p1(0xab),p1(0x00),p1(0x8c),p1(0xbc),p1(0xd3),p1(0x0a) ;\ -+ .long p1(0xf7),p1(0xe4),p1(0x58),p1(0x05),p1(0xb8),p1(0xb3),p1(0x45),p1(0x06) ;\ -+ .long p1(0xd0),p1(0x2c),p1(0x1e),p1(0x8f),p1(0xca),p1(0x3f),p1(0x0f),p1(0x02) ;\ -+ .long p1(0xc1),p1(0xaf),p1(0xbd),p1(0x03),p1(0x01),p1(0x13),p1(0x8a),p1(0x6b) -+#define ib_data4(p1) \ -+ .long p1(0x3a),p1(0x91),p1(0x11),p1(0x41),p1(0x4f),p1(0x67),p1(0xdc),p1(0xea) ;\ -+ .long p1(0x97),p1(0xf2),p1(0xcf),p1(0xce),p1(0xf0),p1(0xb4),p1(0xe6),p1(0x73) ;\ -+ .long p1(0x96),p1(0xac),p1(0x74),p1(0x22),p1(0xe7),p1(0xad),p1(0x35),p1(0x85) ;\ -+ .long p1(0xe2),p1(0xf9),p1(0x37),p1(0xe8),p1(0x1c),p1(0x75),p1(0xdf),p1(0x6e) -+#define ib_data5(p1) \ -+ .long p1(0x47),p1(0xf1),p1(0x1a),p1(0x71),p1(0x1d),p1(0x29),p1(0xc5),p1(0x89) ;\ -+ .long p1(0x6f),p1(0xb7),p1(0x62),p1(0x0e),p1(0xaa),p1(0x18),p1(0xbe),p1(0x1b) ;\ -+ .long p1(0xfc),p1(0x56),p1(0x3e),p1(0x4b),p1(0xc6),p1(0xd2),p1(0x79),p1(0x20) ;\ -+ .long p1(0x9a),p1(0xdb),p1(0xc0),p1(0xfe),p1(0x78),p1(0xcd),p1(0x5a),p1(0xf4) -+#define ib_data6(p1) \ -+ .long p1(0x1f),p1(0xdd),p1(0xa8),p1(0x33),p1(0x88),p1(0x07),p1(0xc7),p1(0x31) ;\ -+ .long p1(0xb1),p1(0x12),p1(0x10),p1(0x59),p1(0x27),p1(0x80),p1(0xec),p1(0x5f) ;\ -+ .long p1(0x60),p1(0x51),p1(0x7f),p1(0xa9),p1(0x19),p1(0xb5),p1(0x4a),p1(0x0d) ;\ -+ .long p1(0x2d),p1(0xe5),p1(0x7a),p1(0x9f),p1(0x93),p1(0xc9),p1(0x9c),p1(0xef) -+#define ib_data7(p1) \ -+ .long p1(0xa0),p1(0xe0),p1(0x3b),p1(0x4d),p1(0xae),p1(0x2a),p1(0xf5),p1(0xb0) ;\ -+ .long p1(0xc8),p1(0xeb),p1(0xbb),p1(0x3c),p1(0x83),p1(0x53),p1(0x99),p1(0x61) ;\ -+ .long p1(0x17),p1(0x2b),p1(0x04),p1(0x7e),p1(0xba),p1(0x77),p1(0xd6),p1(0x26) ;\ -+ .long p1(0xe1),p1(0x69),p1(0x14),p1(0x63),p1(0x55),p1(0x21),p1(0x0c),p1(0x7d) -+ -+// The rcon_table (needed for the key schedule) -+// -+// Here is original Dr Brian Gladman's source code: -+// _rcon_tab: -+// 
%assign x 1 -+// %rep 29 -+// dd x -+// %assign x f2(x) -+// %endrep -+// -+// Here is precomputed output (it's more portable this way): -+ -+ .section .rodata -+ .align ALIGN32BYTES -+aes_rcon_tab: -+ .long 0x01,0x02,0x04,0x08,0x10,0x20,0x40,0x80 -+ .long 0x1b,0x36,0x6c,0xd8,0xab,0x4d,0x9a,0x2f -+ .long 0x5e,0xbc,0x63,0xc6,0x97,0x35,0x6a,0xd4 -+ .long 0xb3,0x7d,0xfa,0xef,0xc5 -+ -+// The forward xor tables -+ -+ .align ALIGN32BYTES -+aes_ft_tab: -+ sb_data0(u0) -+ sb_data1(u0) -+ sb_data2(u0) -+ sb_data3(u0) -+ sb_data4(u0) -+ sb_data5(u0) -+ sb_data6(u0) -+ sb_data7(u0) -+ -+ sb_data0(u1) -+ sb_data1(u1) -+ sb_data2(u1) -+ sb_data3(u1) -+ sb_data4(u1) -+ sb_data5(u1) -+ sb_data6(u1) -+ sb_data7(u1) -+ -+ sb_data0(u2) -+ sb_data1(u2) -+ sb_data2(u2) -+ sb_data3(u2) -+ sb_data4(u2) -+ sb_data5(u2) -+ sb_data6(u2) -+ sb_data7(u2) -+ -+ sb_data0(u3) -+ sb_data1(u3) -+ sb_data2(u3) -+ sb_data3(u3) -+ sb_data4(u3) -+ sb_data5(u3) -+ sb_data6(u3) -+ sb_data7(u3) -+ -+ .align ALIGN32BYTES -+aes_fl_tab: -+ sb_data0(w0) -+ sb_data1(w0) -+ sb_data2(w0) -+ sb_data3(w0) -+ sb_data4(w0) -+ sb_data5(w0) -+ sb_data6(w0) -+ sb_data7(w0) -+ -+ sb_data0(w1) -+ sb_data1(w1) -+ sb_data2(w1) -+ sb_data3(w1) -+ sb_data4(w1) -+ sb_data5(w1) -+ sb_data6(w1) -+ sb_data7(w1) -+ -+ sb_data0(w2) -+ sb_data1(w2) -+ sb_data2(w2) -+ sb_data3(w2) -+ sb_data4(w2) -+ sb_data5(w2) -+ sb_data6(w2) -+ sb_data7(w2) -+ -+ sb_data0(w3) -+ sb_data1(w3) -+ sb_data2(w3) -+ sb_data3(w3) -+ sb_data4(w3) -+ sb_data5(w3) -+ sb_data6(w3) -+ sb_data7(w3) -+ -+// The inverse xor tables -+ -+ .align ALIGN32BYTES -+aes_it_tab: -+ ib_data0(v0) -+ ib_data1(v0) -+ ib_data2(v0) -+ ib_data3(v0) -+ ib_data4(v0) -+ ib_data5(v0) -+ ib_data6(v0) -+ ib_data7(v0) -+ -+ ib_data0(v1) -+ ib_data1(v1) -+ ib_data2(v1) -+ ib_data3(v1) -+ ib_data4(v1) -+ ib_data5(v1) -+ ib_data6(v1) -+ ib_data7(v1) -+ -+ ib_data0(v2) -+ ib_data1(v2) -+ ib_data2(v2) -+ ib_data3(v2) -+ ib_data4(v2) -+ ib_data5(v2) -+ ib_data6(v2) -+ ib_data7(v2) -+ -+ ib_data0(v3) -+ ib_data1(v3) -+ ib_data2(v3) -+ ib_data3(v3) -+ ib_data4(v3) -+ ib_data5(v3) -+ ib_data6(v3) -+ ib_data7(v3) -+ -+ .align ALIGN32BYTES -+aes_il_tab: -+ ib_data0(w0) -+ ib_data1(w0) -+ ib_data2(w0) -+ ib_data3(w0) -+ ib_data4(w0) -+ ib_data5(w0) -+ ib_data6(w0) -+ ib_data7(w0) -+ -+ ib_data0(w1) -+ ib_data1(w1) -+ ib_data2(w1) -+ ib_data3(w1) -+ ib_data4(w1) -+ ib_data5(w1) -+ ib_data6(w1) -+ ib_data7(w1) -+ -+ ib_data0(w2) -+ ib_data1(w2) -+ ib_data2(w2) -+ ib_data3(w2) -+ ib_data4(w2) -+ ib_data5(w2) -+ ib_data6(w2) -+ ib_data7(w2) -+ -+ ib_data0(w3) -+ ib_data1(w3) -+ ib_data2(w3) -+ ib_data3(w3) -+ ib_data4(w3) -+ ib_data5(w3) -+ ib_data6(w3) -+ ib_data7(w3) -+ -+// The inverse mix column tables -+ -+ .align ALIGN32BYTES -+aes_im_tab: -+ im_data0(v0) -+ im_data1(v0) -+ im_data2(v0) -+ im_data3(v0) -+ im_data4(v0) -+ im_data5(v0) -+ im_data6(v0) -+ im_data7(v0) -+ -+ im_data0(v1) -+ im_data1(v1) -+ im_data2(v1) -+ im_data3(v1) -+ im_data4(v1) -+ im_data5(v1) -+ im_data6(v1) -+ im_data7(v1) -+ -+ im_data0(v2) -+ im_data1(v2) -+ im_data2(v2) -+ im_data3(v2) -+ im_data4(v2) -+ im_data5(v2) -+ im_data6(v2) -+ im_data7(v2) -+ -+ im_data0(v3) -+ im_data1(v3) -+ im_data2(v3) -+ im_data3(v3) -+ im_data4(v3) -+ im_data5(v3) -+ im_data6(v3) -+ im_data7(v3) -+ -+#if defined(__ELF__) && defined(SECTION_NOTE_GNU_STACK) -+ .section .note.GNU-stack,"",@progbits -+#endif -diff -urN linux-3.10-noloop/drivers/misc/aes.c linux-3.10-AES/drivers/misc/aes.c ---- linux-3.10-noloop/drivers/misc/aes.c 1970-01-01 02:00:00.000000000 
+0200 -+++ linux-3.10-AES/drivers/misc/aes.c 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,1479 @@ -+// I retain copyright in this code but I encourage its free use provided -+// that I don't carry any responsibility for the results. I am especially -+// happy to see it used in free and open source software. If you do use -+// it I would appreciate an acknowledgement of its origin in the code or -+// the product that results and I would also appreciate knowing a little -+// about the use to which it is being put. I am grateful to Frank Yellin -+// for some ideas that are used in this implementation. -+// -+// Dr B. R. Gladman 6th April 2001. -+// -+// This is an implementation of the AES encryption algorithm (Rijndael) -+// designed by Joan Daemen and Vincent Rijmen. This version is designed -+// to provide both fixed and dynamic block and key lengths and can also -+// run with either big or little endian internal byte order (see aes.h). -+// It inputs block and key lengths in bytes with the legal values being -+// 16, 24 and 32. -+ -+/* -+ * Modified by Jari Ruusu, May 1 2001 -+ * - Fixed some compile warnings, code was ok but gcc warned anyway. -+ * - Changed basic types: byte -> unsigned char, word -> u_int32_t -+ * - Major name space cleanup: Names visible to outside now begin -+ * with "aes_" or "AES_". A lot of stuff moved from aes.h to aes.c -+ * - Removed C++ and DLL support as part of name space cleanup. -+ * - Eliminated unnecessary recomputation of tables. (actual bug fix) -+ * - Merged precomputed constant tables to aes.c file. -+ * - Removed data alignment restrictions for portability reasons. -+ * - Made block and key lengths accept bit count (128/192/256) -+ * as well byte count (16/24/32). -+ * - Removed all error checks. This change also eliminated the need -+ * to preinitialize the context struct to zero. -+ * - Removed some totally unused constants. -+ */ -+/* -+ * Modified by Jari Ruusu, April 21 2004 -+ * - Added back code that avoids byte swaps on big endian boxes. -+ */ -+ -+#include "aes.h" -+ -+// CONFIGURATION OPTIONS (see also aes.h) -+// -+// 1. Define UNROLL for full loop unrolling in encryption and decryption. -+// 2. Define PARTIAL_UNROLL to unroll two loops in encryption and decryption. -+// 3. Define FIXED_TABLES for compiled rather than dynamic tables. -+// 4. Define FF_TABLES to use tables for field multiplies and inverses. -+// Do not enable this without understanding stack space requirements. -+// 5. Define ARRAYS to use arrays to hold the local state block. If this -+// is not defined, individually declared 32-bit words are used. -+// 6. Define FAST_VARIABLE if a high speed variable block implementation -+// is needed (essentially three separate fixed block size code sequences) -+// 7. Define either ONE_TABLE or FOUR_TABLES for a fast table driven -+// version using 1 table (2 kbytes of table space) or 4 tables (8 -+// kbytes of table space) for higher speed. -+// 8. Define either ONE_LR_TABLE or FOUR_LR_TABLES for a further speed -+// increase by using tables for the last rounds but with more table -+// space (2 or 8 kbytes extra). -+// 9. If neither ONE_TABLE nor FOUR_TABLES is defined, a compact but -+// slower version is provided. -+// 10. If fast decryption key scheduling is needed define ONE_IM_TABLE -+// or FOUR_IM_TABLES for higher speed (2 or 8 kbytes extra). 
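As an illustration of the option set above, a space-constrained build could select the single-table variants instead of the four-table ones chosen below; this combination is only a sketch of what options 3, 7, 8 and 10 permit, not what this file actually uses.

    /* Compact alternative configuration: one 2 kbyte table each for the main
       rounds, the last rounds and the inverse mix column, trading speed for
       table space. The defines that follow pick the faster four-table setup. */
    #define FIXED_TABLES
    #define ONE_TABLE
    #define ONE_LR_TABLE
    #define ONE_IM_TABLE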
-+ -+#define UNROLL -+//#define PARTIAL_UNROLL -+ -+#define FIXED_TABLES -+//#define FF_TABLES -+//#define ARRAYS -+#define FAST_VARIABLE -+ -+//#define ONE_TABLE -+#define FOUR_TABLES -+ -+//#define ONE_LR_TABLE -+#define FOUR_LR_TABLES -+ -+//#define ONE_IM_TABLE -+#define FOUR_IM_TABLES -+ -+#if defined(UNROLL) && defined (PARTIAL_UNROLL) -+#error both UNROLL and PARTIAL_UNROLL are defined -+#endif -+ -+#if defined(ONE_TABLE) && defined (FOUR_TABLES) -+#error both ONE_TABLE and FOUR_TABLES are defined -+#endif -+ -+#if defined(ONE_LR_TABLE) && defined (FOUR_LR_TABLES) -+#error both ONE_LR_TABLE and FOUR_LR_TABLES are defined -+#endif -+ -+#if defined(ONE_IM_TABLE) && defined (FOUR_IM_TABLES) -+#error both ONE_IM_TABLE and FOUR_IM_TABLES are defined -+#endif -+ -+#if defined(AES_BLOCK_SIZE) && AES_BLOCK_SIZE != 16 && AES_BLOCK_SIZE != 24 && AES_BLOCK_SIZE != 32 -+#error an illegal block size has been specified -+#endif -+ -+/* INTERNAL_BYTE_ORDER: 0=unknown, 1=little endian, 2=big endian */ -+#if defined(INTERNAL_BYTE_ORDER) -+#elif defined(__i386__)||defined(__i386)||defined(__x86_64__)||defined(__x86_64)||defined(__amd64__)||defined(__amd64)||defined(__AMD64__)||defined(__AMD64) -+# define INTERNAL_BYTE_ORDER 1 -+# undef DATA_ALWAYS_ALIGNED -+# define DATA_ALWAYS_ALIGNED 1 /* unaligned access is always ok */ -+#elif defined(__ppc__)||defined(__ppc)||defined(__PPC__)||defined(__PPC)||defined(__powerpc__)||defined(__powerpc)||defined(__POWERPC__)||defined(__POWERPC)||defined(__PowerPC__)||defined(__PowerPC)||defined(__ppc64__)||defined(__ppc64)||defined(__PPC64__)||defined(__PPC64)||defined(__powerpc64__)||defined(__powerpc64)||defined(__s390__)||defined(__s390) -+# define INTERNAL_BYTE_ORDER 2 -+# undef DATA_ALWAYS_ALIGNED -+# define DATA_ALWAYS_ALIGNED 1 /* unaligned access is always ok */ -+#elif defined(__alpha__)||defined(__alpha)||defined(__ia64__)||defined(__ia64) -+# define INTERNAL_BYTE_ORDER 1 -+#elif defined(__hppa__)||defined(__hppa)||defined(__HPPA__)||defined(__HPPA)||defined(__parisc__)||defined(__parisc)||defined(__sparc__)||defined(__sparc)||defined(__sparc_v9__)||defined(__sparc_v9)||defined(__sparc64__)||defined(__sparc64)||defined(__mc68000__)||defined(__mc68000) -+# define INTERNAL_BYTE_ORDER 2 -+#elif defined(CONFIGURE_DETECTS_BYTE_ORDER) -+# if WORDS_BIGENDIAN -+# define INTERNAL_BYTE_ORDER 2 -+# else -+# define INTERNAL_BYTE_ORDER 1 -+# endif -+#elif defined(__linux__) && defined(__KERNEL__) -+# include -+# if defined(__BIG_ENDIAN) -+# define INTERNAL_BYTE_ORDER 2 -+# else -+# define INTERNAL_BYTE_ORDER 1 -+# endif -+#else -+# include -+# if (defined(BYTE_ORDER) && defined(LITTLE_ENDIAN) && (BYTE_ORDER == LITTLE_ENDIAN)) || (defined(__BYTE_ORDER) && defined(__LITTLE_ENDIAN) && (__BYTE_ORDER == __LITTLE_ENDIAN)) -+# define INTERNAL_BYTE_ORDER 1 -+# elif WORDS_BIGENDIAN || defined(__BIG_ENDIAN__) || (defined(BYTE_ORDER) && defined(BIG_ENDIAN) && (BYTE_ORDER == BIG_ENDIAN)) || (defined(__BYTE_ORDER) && defined(__BIG_ENDIAN) && (__BYTE_ORDER == __BIG_ENDIAN)) -+# define INTERNAL_BYTE_ORDER 2 -+# else -+# define INTERNAL_BYTE_ORDER 0 -+# endif -+#endif -+ -+#if defined(DATA_ALWAYS_ALIGNED) && (INTERNAL_BYTE_ORDER > 0) -+# define word_in(x) *(u_int32_t*)(x) -+# define word_out(x,v) *(u_int32_t*)(x) = (v) -+#elif defined(__linux__) && defined(__KERNEL__) -+# include -+# define word_in(x) get_unaligned((u_int32_t*)(x)) -+# define word_out(x,v) put_unaligned((v),(u_int32_t*)(x)) -+#else -+/* unknown endianness and/or unable to handle unaligned data */ -+# undef 
INTERNAL_BYTE_ORDER -+# define INTERNAL_BYTE_ORDER 1 -+# define word_in(x) ((u_int32_t)(((unsigned char *)(x))[0])|((u_int32_t)(((unsigned char *)(x))[1])<<8)|((u_int32_t)(((unsigned char *)(x))[2])<<16)|((u_int32_t)(((unsigned char *)(x))[3])<<24)) -+# define word_out(x,v) ((unsigned char *)(x))[0]=(v),((unsigned char *)(x))[1]=((v)>>8),((unsigned char *)(x))[2]=((v)>>16),((unsigned char *)(x))[3]=((v)>>24) -+#endif -+ -+// upr(x,n): rotates bytes within words by n positions, moving bytes -+// to higher index positions with wrap around into low positions -+// ups(x,n): moves bytes by n positions to higher index positions in -+// words but without wrap around -+// bval(x,n): extracts a byte from a word -+ -+#if (INTERNAL_BYTE_ORDER < 2) -+/* little endian */ -+#define upr(x,n) (((x) << 8 * (n)) | ((x) >> (32 - 8 * (n)))) -+#define ups(x,n) ((x) << 8 * (n)) -+#define bval(x,n) ((unsigned char)((x) >> 8 * (n))) -+#define bytes2word(b0, b1, b2, b3) \ -+ ((u_int32_t)(b3) << 24 | (u_int32_t)(b2) << 16 | (u_int32_t)(b1) << 8 | (b0)) -+#else -+/* big endian */ -+#define upr(x,n) (((x) >> 8 * (n)) | ((x) << (32 - 8 * (n)))) -+#define ups(x,n) ((x) >> 8 * (n))) -+#define bval(x,n) ((unsigned char)((x) >> (24 - 8 * (n)))) -+#define bytes2word(b0, b1, b2, b3) \ -+ ((u_int32_t)(b0) << 24 | (u_int32_t)(b1) << 16 | (u_int32_t)(b2) << 8 | (b3)) -+#endif -+ -+// Disable at least some poor combinations of options -+ -+#if !defined(ONE_TABLE) && !defined(FOUR_TABLES) -+#define FIXED_TABLES -+#undef UNROLL -+#undef ONE_LR_TABLE -+#undef FOUR_LR_TABLES -+#undef ONE_IM_TABLE -+#undef FOUR_IM_TABLES -+#elif !defined(FOUR_TABLES) -+#ifdef FOUR_LR_TABLES -+#undef FOUR_LR_TABLES -+#define ONE_LR_TABLE -+#endif -+#ifdef FOUR_IM_TABLES -+#undef FOUR_IM_TABLES -+#define ONE_IM_TABLE -+#endif -+#elif !defined(AES_BLOCK_SIZE) -+#if defined(UNROLL) -+#define PARTIAL_UNROLL -+#undef UNROLL -+#endif -+#endif -+ -+// the finite field modular polynomial and elements -+ -+#define ff_poly 0x011b -+#define ff_hi 0x80 -+ -+// multiply four bytes in GF(2^8) by 'x' {02} in parallel -+ -+#define m1 0x80808080 -+#define m2 0x7f7f7f7f -+#define m3 0x0000001b -+#define FFmulX(x) ((((x) & m2) << 1) ^ ((((x) & m1) >> 7) * m3)) -+ -+// The following defines provide alternative definitions of FFmulX that might -+// give improved performance if a fast 32-bit multiply is not available. Note -+// that a temporary variable u needs to be defined where FFmulX is used. 
-+ -+// #define FFmulX(x) (u = (x) & m1, u |= (u >> 1), ((x) & m2) << 1) ^ ((u >> 3) | (u >> 6)) -+// #define m4 0x1b1b1b1b -+// #define FFmulX(x) (u = (x) & m1, ((x) & m2) << 1) ^ ((u - (u >> 7)) & m4) -+ -+// perform column mix operation on four bytes in parallel -+ -+#define fwd_mcol(x) (f2 = FFmulX(x), f2 ^ upr(x ^ f2,3) ^ upr(x,2) ^ upr(x,1)) -+ -+#if defined(FIXED_TABLES) -+ -+// the S-Box table -+ -+static const unsigned char s_box[256] = -+{ -+ 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, -+ 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76, -+ 0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, -+ 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0, -+ 0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc, -+ 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15, -+ 0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a, -+ 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75, -+ 0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0, -+ 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84, -+ 0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b, -+ 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf, -+ 0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85, -+ 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8, -+ 0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5, -+ 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2, -+ 0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17, -+ 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73, -+ 0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88, -+ 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb, -+ 0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c, -+ 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79, -+ 0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9, -+ 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08, -+ 0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6, -+ 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a, -+ 0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e, -+ 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e, -+ 0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94, -+ 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf, -+ 0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, -+ 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16 -+}; -+ -+// the inverse S-Box table -+ -+static const unsigned char inv_s_box[256] = -+{ -+ 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38, -+ 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb, -+ 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87, -+ 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb, -+ 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d, -+ 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e, -+ 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2, -+ 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25, -+ 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16, -+ 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92, -+ 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda, -+ 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84, -+ 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a, -+ 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06, -+ 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02, -+ 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b, -+ 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea, -+ 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73, -+ 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85, -+ 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e, -+ 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89, -+ 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b, -+ 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20, -+ 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4, -+ 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31, -+ 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f, -+ 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d, -+ 
0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef, -+ 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0, -+ 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61, -+ 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26, -+ 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d -+}; -+ -+// used to ensure table is generated in the right format -+// depending on the internal byte order required -+ -+#if (INTERNAL_BYTE_ORDER < 2) -+/* little endian */ -+#define w0(p) 0x000000##p -+#else -+/* big endian */ -+#define w0(p) 0x##p##000000 -+#endif -+ -+// Number of elements required in this table for different -+// block and key lengths is: -+// -+// Nk = 4 6 8 -+// ---------- -+// Nb = 4 | 10 8 7 -+// 6 | 19 12 11 -+// 8 | 29 19 14 -+// -+// this table can be a table of bytes if the key schedule -+// code is adjusted accordingly -+ -+static const u_int32_t rcon_tab[29] = -+{ -+ w0(01), w0(02), w0(04), w0(08), -+ w0(10), w0(20), w0(40), w0(80), -+ w0(1b), w0(36), w0(6c), w0(d8), -+ w0(ab), w0(4d), w0(9a), w0(2f), -+ w0(5e), w0(bc), w0(63), w0(c6), -+ w0(97), w0(35), w0(6a), w0(d4), -+ w0(b3), w0(7d), w0(fa), w0(ef), -+ w0(c5) -+}; -+ -+#undef w0 -+ -+// used to ensure table is generated in the right format -+// depending on the internal byte order required -+ -+#if (INTERNAL_BYTE_ORDER < 2) -+/* little endian */ -+#define r0(p,q,r,s) 0x##p##q##r##s -+#define r1(p,q,r,s) 0x##q##r##s##p -+#define r2(p,q,r,s) 0x##r##s##p##q -+#define r3(p,q,r,s) 0x##s##p##q##r -+#define w0(p) 0x000000##p -+#define w1(p) 0x0000##p##00 -+#define w2(p) 0x00##p##0000 -+#define w3(p) 0x##p##000000 -+#else -+/* big endian */ -+#define r0(p,q,r,s) 0x##s##r##q##p -+#define r1(p,q,r,s) 0x##p##s##r##q -+#define r2(p,q,r,s) 0x##q##p##s##r -+#define r3(p,q,r,s) 0x##r##q##p##s -+#define w0(p) 0x##p##000000 -+#define w1(p) 0x00##p##0000 -+#define w2(p) 0x0000##p##00 -+#define w3(p) 0x000000##p -+#endif -+ -+#if defined(FIXED_TABLES) && (defined(ONE_TABLE) || defined(FOUR_TABLES)) -+ -+// data for forward tables (other than last round) -+ -+#define f_table \ -+ r(a5,63,63,c6), r(84,7c,7c,f8), r(99,77,77,ee), r(8d,7b,7b,f6),\ -+ r(0d,f2,f2,ff), r(bd,6b,6b,d6), r(b1,6f,6f,de), r(54,c5,c5,91),\ -+ r(50,30,30,60), r(03,01,01,02), r(a9,67,67,ce), r(7d,2b,2b,56),\ -+ r(19,fe,fe,e7), r(62,d7,d7,b5), r(e6,ab,ab,4d), r(9a,76,76,ec),\ -+ r(45,ca,ca,8f), r(9d,82,82,1f), r(40,c9,c9,89), r(87,7d,7d,fa),\ -+ r(15,fa,fa,ef), r(eb,59,59,b2), r(c9,47,47,8e), r(0b,f0,f0,fb),\ -+ r(ec,ad,ad,41), r(67,d4,d4,b3), r(fd,a2,a2,5f), r(ea,af,af,45),\ -+ r(bf,9c,9c,23), r(f7,a4,a4,53), r(96,72,72,e4), r(5b,c0,c0,9b),\ -+ r(c2,b7,b7,75), r(1c,fd,fd,e1), r(ae,93,93,3d), r(6a,26,26,4c),\ -+ r(5a,36,36,6c), r(41,3f,3f,7e), r(02,f7,f7,f5), r(4f,cc,cc,83),\ -+ r(5c,34,34,68), r(f4,a5,a5,51), r(34,e5,e5,d1), r(08,f1,f1,f9),\ -+ r(93,71,71,e2), r(73,d8,d8,ab), r(53,31,31,62), r(3f,15,15,2a),\ -+ r(0c,04,04,08), r(52,c7,c7,95), r(65,23,23,46), r(5e,c3,c3,9d),\ -+ r(28,18,18,30), r(a1,96,96,37), r(0f,05,05,0a), r(b5,9a,9a,2f),\ -+ r(09,07,07,0e), r(36,12,12,24), r(9b,80,80,1b), r(3d,e2,e2,df),\ -+ r(26,eb,eb,cd), r(69,27,27,4e), r(cd,b2,b2,7f), r(9f,75,75,ea),\ -+ r(1b,09,09,12), r(9e,83,83,1d), r(74,2c,2c,58), r(2e,1a,1a,34),\ -+ r(2d,1b,1b,36), r(b2,6e,6e,dc), r(ee,5a,5a,b4), r(fb,a0,a0,5b),\ -+ r(f6,52,52,a4), r(4d,3b,3b,76), r(61,d6,d6,b7), r(ce,b3,b3,7d),\ -+ r(7b,29,29,52), r(3e,e3,e3,dd), r(71,2f,2f,5e), r(97,84,84,13),\ -+ r(f5,53,53,a6), r(68,d1,d1,b9), r(00,00,00,00), r(2c,ed,ed,c1),\ -+ r(60,20,20,40), r(1f,fc,fc,e3), r(c8,b1,b1,79), r(ed,5b,5b,b6),\ -+ r(be,6a,6a,d4), 
r(46,cb,cb,8d), r(d9,be,be,67), r(4b,39,39,72),\ -+ r(de,4a,4a,94), r(d4,4c,4c,98), r(e8,58,58,b0), r(4a,cf,cf,85),\ -+ r(6b,d0,d0,bb), r(2a,ef,ef,c5), r(e5,aa,aa,4f), r(16,fb,fb,ed),\ -+ r(c5,43,43,86), r(d7,4d,4d,9a), r(55,33,33,66), r(94,85,85,11),\ -+ r(cf,45,45,8a), r(10,f9,f9,e9), r(06,02,02,04), r(81,7f,7f,fe),\ -+ r(f0,50,50,a0), r(44,3c,3c,78), r(ba,9f,9f,25), r(e3,a8,a8,4b),\ -+ r(f3,51,51,a2), r(fe,a3,a3,5d), r(c0,40,40,80), r(8a,8f,8f,05),\ -+ r(ad,92,92,3f), r(bc,9d,9d,21), r(48,38,38,70), r(04,f5,f5,f1),\ -+ r(df,bc,bc,63), r(c1,b6,b6,77), r(75,da,da,af), r(63,21,21,42),\ -+ r(30,10,10,20), r(1a,ff,ff,e5), r(0e,f3,f3,fd), r(6d,d2,d2,bf),\ -+ r(4c,cd,cd,81), r(14,0c,0c,18), r(35,13,13,26), r(2f,ec,ec,c3),\ -+ r(e1,5f,5f,be), r(a2,97,97,35), r(cc,44,44,88), r(39,17,17,2e),\ -+ r(57,c4,c4,93), r(f2,a7,a7,55), r(82,7e,7e,fc), r(47,3d,3d,7a),\ -+ r(ac,64,64,c8), r(e7,5d,5d,ba), r(2b,19,19,32), r(95,73,73,e6),\ -+ r(a0,60,60,c0), r(98,81,81,19), r(d1,4f,4f,9e), r(7f,dc,dc,a3),\ -+ r(66,22,22,44), r(7e,2a,2a,54), r(ab,90,90,3b), r(83,88,88,0b),\ -+ r(ca,46,46,8c), r(29,ee,ee,c7), r(d3,b8,b8,6b), r(3c,14,14,28),\ -+ r(79,de,de,a7), r(e2,5e,5e,bc), r(1d,0b,0b,16), r(76,db,db,ad),\ -+ r(3b,e0,e0,db), r(56,32,32,64), r(4e,3a,3a,74), r(1e,0a,0a,14),\ -+ r(db,49,49,92), r(0a,06,06,0c), r(6c,24,24,48), r(e4,5c,5c,b8),\ -+ r(5d,c2,c2,9f), r(6e,d3,d3,bd), r(ef,ac,ac,43), r(a6,62,62,c4),\ -+ r(a8,91,91,39), r(a4,95,95,31), r(37,e4,e4,d3), r(8b,79,79,f2),\ -+ r(32,e7,e7,d5), r(43,c8,c8,8b), r(59,37,37,6e), r(b7,6d,6d,da),\ -+ r(8c,8d,8d,01), r(64,d5,d5,b1), r(d2,4e,4e,9c), r(e0,a9,a9,49),\ -+ r(b4,6c,6c,d8), r(fa,56,56,ac), r(07,f4,f4,f3), r(25,ea,ea,cf),\ -+ r(af,65,65,ca), r(8e,7a,7a,f4), r(e9,ae,ae,47), r(18,08,08,10),\ -+ r(d5,ba,ba,6f), r(88,78,78,f0), r(6f,25,25,4a), r(72,2e,2e,5c),\ -+ r(24,1c,1c,38), r(f1,a6,a6,57), r(c7,b4,b4,73), r(51,c6,c6,97),\ -+ r(23,e8,e8,cb), r(7c,dd,dd,a1), r(9c,74,74,e8), r(21,1f,1f,3e),\ -+ r(dd,4b,4b,96), r(dc,bd,bd,61), r(86,8b,8b,0d), r(85,8a,8a,0f),\ -+ r(90,70,70,e0), r(42,3e,3e,7c), r(c4,b5,b5,71), r(aa,66,66,cc),\ -+ r(d8,48,48,90), r(05,03,03,06), r(01,f6,f6,f7), r(12,0e,0e,1c),\ -+ r(a3,61,61,c2), r(5f,35,35,6a), r(f9,57,57,ae), r(d0,b9,b9,69),\ -+ r(91,86,86,17), r(58,c1,c1,99), r(27,1d,1d,3a), r(b9,9e,9e,27),\ -+ r(38,e1,e1,d9), r(13,f8,f8,eb), r(b3,98,98,2b), r(33,11,11,22),\ -+ r(bb,69,69,d2), r(70,d9,d9,a9), r(89,8e,8e,07), r(a7,94,94,33),\ -+ r(b6,9b,9b,2d), r(22,1e,1e,3c), r(92,87,87,15), r(20,e9,e9,c9),\ -+ r(49,ce,ce,87), r(ff,55,55,aa), r(78,28,28,50), r(7a,df,df,a5),\ -+ r(8f,8c,8c,03), r(f8,a1,a1,59), r(80,89,89,09), r(17,0d,0d,1a),\ -+ r(da,bf,bf,65), r(31,e6,e6,d7), r(c6,42,42,84), r(b8,68,68,d0),\ -+ r(c3,41,41,82), r(b0,99,99,29), r(77,2d,2d,5a), r(11,0f,0f,1e),\ -+ r(cb,b0,b0,7b), r(fc,54,54,a8), r(d6,bb,bb,6d), r(3a,16,16,2c) -+ -+// data for inverse tables (other than last round) -+ -+#define i_table \ -+ r(50,a7,f4,51), r(53,65,41,7e), r(c3,a4,17,1a), r(96,5e,27,3a),\ -+ r(cb,6b,ab,3b), r(f1,45,9d,1f), r(ab,58,fa,ac), r(93,03,e3,4b),\ -+ r(55,fa,30,20), r(f6,6d,76,ad), r(91,76,cc,88), r(25,4c,02,f5),\ -+ r(fc,d7,e5,4f), r(d7,cb,2a,c5), r(80,44,35,26), r(8f,a3,62,b5),\ -+ r(49,5a,b1,de), r(67,1b,ba,25), r(98,0e,ea,45), r(e1,c0,fe,5d),\ -+ r(02,75,2f,c3), r(12,f0,4c,81), r(a3,97,46,8d), r(c6,f9,d3,6b),\ -+ r(e7,5f,8f,03), r(95,9c,92,15), r(eb,7a,6d,bf), r(da,59,52,95),\ -+ r(2d,83,be,d4), r(d3,21,74,58), r(29,69,e0,49), r(44,c8,c9,8e),\ -+ r(6a,89,c2,75), r(78,79,8e,f4), r(6b,3e,58,99), r(dd,71,b9,27),\ -+ r(b6,4f,e1,be), 
r(17,ad,88,f0), r(66,ac,20,c9), r(b4,3a,ce,7d),\ -+ r(18,4a,df,63), r(82,31,1a,e5), r(60,33,51,97), r(45,7f,53,62),\ -+ r(e0,77,64,b1), r(84,ae,6b,bb), r(1c,a0,81,fe), r(94,2b,08,f9),\ -+ r(58,68,48,70), r(19,fd,45,8f), r(87,6c,de,94), r(b7,f8,7b,52),\ -+ r(23,d3,73,ab), r(e2,02,4b,72), r(57,8f,1f,e3), r(2a,ab,55,66),\ -+ r(07,28,eb,b2), r(03,c2,b5,2f), r(9a,7b,c5,86), r(a5,08,37,d3),\ -+ r(f2,87,28,30), r(b2,a5,bf,23), r(ba,6a,03,02), r(5c,82,16,ed),\ -+ r(2b,1c,cf,8a), r(92,b4,79,a7), r(f0,f2,07,f3), r(a1,e2,69,4e),\ -+ r(cd,f4,da,65), r(d5,be,05,06), r(1f,62,34,d1), r(8a,fe,a6,c4),\ -+ r(9d,53,2e,34), r(a0,55,f3,a2), r(32,e1,8a,05), r(75,eb,f6,a4),\ -+ r(39,ec,83,0b), r(aa,ef,60,40), r(06,9f,71,5e), r(51,10,6e,bd),\ -+ r(f9,8a,21,3e), r(3d,06,dd,96), r(ae,05,3e,dd), r(46,bd,e6,4d),\ -+ r(b5,8d,54,91), r(05,5d,c4,71), r(6f,d4,06,04), r(ff,15,50,60),\ -+ r(24,fb,98,19), r(97,e9,bd,d6), r(cc,43,40,89), r(77,9e,d9,67),\ -+ r(bd,42,e8,b0), r(88,8b,89,07), r(38,5b,19,e7), r(db,ee,c8,79),\ -+ r(47,0a,7c,a1), r(e9,0f,42,7c), r(c9,1e,84,f8), r(00,00,00,00),\ -+ r(83,86,80,09), r(48,ed,2b,32), r(ac,70,11,1e), r(4e,72,5a,6c),\ -+ r(fb,ff,0e,fd), r(56,38,85,0f), r(1e,d5,ae,3d), r(27,39,2d,36),\ -+ r(64,d9,0f,0a), r(21,a6,5c,68), r(d1,54,5b,9b), r(3a,2e,36,24),\ -+ r(b1,67,0a,0c), r(0f,e7,57,93), r(d2,96,ee,b4), r(9e,91,9b,1b),\ -+ r(4f,c5,c0,80), r(a2,20,dc,61), r(69,4b,77,5a), r(16,1a,12,1c),\ -+ r(0a,ba,93,e2), r(e5,2a,a0,c0), r(43,e0,22,3c), r(1d,17,1b,12),\ -+ r(0b,0d,09,0e), r(ad,c7,8b,f2), r(b9,a8,b6,2d), r(c8,a9,1e,14),\ -+ r(85,19,f1,57), r(4c,07,75,af), r(bb,dd,99,ee), r(fd,60,7f,a3),\ -+ r(9f,26,01,f7), r(bc,f5,72,5c), r(c5,3b,66,44), r(34,7e,fb,5b),\ -+ r(76,29,43,8b), r(dc,c6,23,cb), r(68,fc,ed,b6), r(63,f1,e4,b8),\ -+ r(ca,dc,31,d7), r(10,85,63,42), r(40,22,97,13), r(20,11,c6,84),\ -+ r(7d,24,4a,85), r(f8,3d,bb,d2), r(11,32,f9,ae), r(6d,a1,29,c7),\ -+ r(4b,2f,9e,1d), r(f3,30,b2,dc), r(ec,52,86,0d), r(d0,e3,c1,77),\ -+ r(6c,16,b3,2b), r(99,b9,70,a9), r(fa,48,94,11), r(22,64,e9,47),\ -+ r(c4,8c,fc,a8), r(1a,3f,f0,a0), r(d8,2c,7d,56), r(ef,90,33,22),\ -+ r(c7,4e,49,87), r(c1,d1,38,d9), r(fe,a2,ca,8c), r(36,0b,d4,98),\ -+ r(cf,81,f5,a6), r(28,de,7a,a5), r(26,8e,b7,da), r(a4,bf,ad,3f),\ -+ r(e4,9d,3a,2c), r(0d,92,78,50), r(9b,cc,5f,6a), r(62,46,7e,54),\ -+ r(c2,13,8d,f6), r(e8,b8,d8,90), r(5e,f7,39,2e), r(f5,af,c3,82),\ -+ r(be,80,5d,9f), r(7c,93,d0,69), r(a9,2d,d5,6f), r(b3,12,25,cf),\ -+ r(3b,99,ac,c8), r(a7,7d,18,10), r(6e,63,9c,e8), r(7b,bb,3b,db),\ -+ r(09,78,26,cd), r(f4,18,59,6e), r(01,b7,9a,ec), r(a8,9a,4f,83),\ -+ r(65,6e,95,e6), r(7e,e6,ff,aa), r(08,cf,bc,21), r(e6,e8,15,ef),\ -+ r(d9,9b,e7,ba), r(ce,36,6f,4a), r(d4,09,9f,ea), r(d6,7c,b0,29),\ -+ r(af,b2,a4,31), r(31,23,3f,2a), r(30,94,a5,c6), r(c0,66,a2,35),\ -+ r(37,bc,4e,74), r(a6,ca,82,fc), r(b0,d0,90,e0), r(15,d8,a7,33),\ -+ r(4a,98,04,f1), r(f7,da,ec,41), r(0e,50,cd,7f), r(2f,f6,91,17),\ -+ r(8d,d6,4d,76), r(4d,b0,ef,43), r(54,4d,aa,cc), r(df,04,96,e4),\ -+ r(e3,b5,d1,9e), r(1b,88,6a,4c), r(b8,1f,2c,c1), r(7f,51,65,46),\ -+ r(04,ea,5e,9d), r(5d,35,8c,01), r(73,74,87,fa), r(2e,41,0b,fb),\ -+ r(5a,1d,67,b3), r(52,d2,db,92), r(33,56,10,e9), r(13,47,d6,6d),\ -+ r(8c,61,d7,9a), r(7a,0c,a1,37), r(8e,14,f8,59), r(89,3c,13,eb),\ -+ r(ee,27,a9,ce), r(35,c9,61,b7), r(ed,e5,1c,e1), r(3c,b1,47,7a),\ -+ r(59,df,d2,9c), r(3f,73,f2,55), r(79,ce,14,18), r(bf,37,c7,73),\ -+ r(ea,cd,f7,53), r(5b,aa,fd,5f), r(14,6f,3d,df), r(86,db,44,78),\ -+ r(81,f3,af,ca), r(3e,c4,68,b9), r(2c,34,24,38), r(5f,40,a3,c2),\ -+ r(72,c3,1d,16), r(0c,25,e2,bc), 
r(8b,49,3c,28), r(41,95,0d,ff),\ -+ r(71,01,a8,39), r(de,b3,0c,08), r(9c,e4,b4,d8), r(90,c1,56,64),\ -+ r(61,84,cb,7b), r(70,b6,32,d5), r(74,5c,6c,48), r(42,57,b8,d0) -+ -+// generate the required tables in the desired endian format -+ -+#undef r -+#define r r0 -+ -+#if defined(ONE_TABLE) -+static const u_int32_t ft_tab[256] = -+ { f_table }; -+#elif defined(FOUR_TABLES) -+static const u_int32_t ft_tab[4][256] = -+{ { f_table }, -+#undef r -+#define r r1 -+ { f_table }, -+#undef r -+#define r r2 -+ { f_table }, -+#undef r -+#define r r3 -+ { f_table } -+}; -+#endif -+ -+#undef r -+#define r r0 -+#if defined(ONE_TABLE) -+static const u_int32_t it_tab[256] = -+ { i_table }; -+#elif defined(FOUR_TABLES) -+static const u_int32_t it_tab[4][256] = -+{ { i_table }, -+#undef r -+#define r r1 -+ { i_table }, -+#undef r -+#define r r2 -+ { i_table }, -+#undef r -+#define r r3 -+ { i_table } -+}; -+#endif -+ -+#endif -+ -+#if defined(FIXED_TABLES) && (defined(ONE_LR_TABLE) || defined(FOUR_LR_TABLES)) -+ -+// data for inverse tables (last round) -+ -+#define li_table \ -+ w(52), w(09), w(6a), w(d5), w(30), w(36), w(a5), w(38),\ -+ w(bf), w(40), w(a3), w(9e), w(81), w(f3), w(d7), w(fb),\ -+ w(7c), w(e3), w(39), w(82), w(9b), w(2f), w(ff), w(87),\ -+ w(34), w(8e), w(43), w(44), w(c4), w(de), w(e9), w(cb),\ -+ w(54), w(7b), w(94), w(32), w(a6), w(c2), w(23), w(3d),\ -+ w(ee), w(4c), w(95), w(0b), w(42), w(fa), w(c3), w(4e),\ -+ w(08), w(2e), w(a1), w(66), w(28), w(d9), w(24), w(b2),\ -+ w(76), w(5b), w(a2), w(49), w(6d), w(8b), w(d1), w(25),\ -+ w(72), w(f8), w(f6), w(64), w(86), w(68), w(98), w(16),\ -+ w(d4), w(a4), w(5c), w(cc), w(5d), w(65), w(b6), w(92),\ -+ w(6c), w(70), w(48), w(50), w(fd), w(ed), w(b9), w(da),\ -+ w(5e), w(15), w(46), w(57), w(a7), w(8d), w(9d), w(84),\ -+ w(90), w(d8), w(ab), w(00), w(8c), w(bc), w(d3), w(0a),\ -+ w(f7), w(e4), w(58), w(05), w(b8), w(b3), w(45), w(06),\ -+ w(d0), w(2c), w(1e), w(8f), w(ca), w(3f), w(0f), w(02),\ -+ w(c1), w(af), w(bd), w(03), w(01), w(13), w(8a), w(6b),\ -+ w(3a), w(91), w(11), w(41), w(4f), w(67), w(dc), w(ea),\ -+ w(97), w(f2), w(cf), w(ce), w(f0), w(b4), w(e6), w(73),\ -+ w(96), w(ac), w(74), w(22), w(e7), w(ad), w(35), w(85),\ -+ w(e2), w(f9), w(37), w(e8), w(1c), w(75), w(df), w(6e),\ -+ w(47), w(f1), w(1a), w(71), w(1d), w(29), w(c5), w(89),\ -+ w(6f), w(b7), w(62), w(0e), w(aa), w(18), w(be), w(1b),\ -+ w(fc), w(56), w(3e), w(4b), w(c6), w(d2), w(79), w(20),\ -+ w(9a), w(db), w(c0), w(fe), w(78), w(cd), w(5a), w(f4),\ -+ w(1f), w(dd), w(a8), w(33), w(88), w(07), w(c7), w(31),\ -+ w(b1), w(12), w(10), w(59), w(27), w(80), w(ec), w(5f),\ -+ w(60), w(51), w(7f), w(a9), w(19), w(b5), w(4a), w(0d),\ -+ w(2d), w(e5), w(7a), w(9f), w(93), w(c9), w(9c), w(ef),\ -+ w(a0), w(e0), w(3b), w(4d), w(ae), w(2a), w(f5), w(b0),\ -+ w(c8), w(eb), w(bb), w(3c), w(83), w(53), w(99), w(61),\ -+ w(17), w(2b), w(04), w(7e), w(ba), w(77), w(d6), w(26),\ -+ w(e1), w(69), w(14), w(63), w(55), w(21), w(0c), w(7d), -+ -+// generate the required tables in the desired endian format -+ -+#undef r -+#define r(p,q,r,s) w0(q) -+#if defined(ONE_LR_TABLE) -+static const u_int32_t fl_tab[256] = -+ { f_table }; -+#elif defined(FOUR_LR_TABLES) -+static const u_int32_t fl_tab[4][256] = -+{ { f_table }, -+#undef r -+#define r(p,q,r,s) w1(q) -+ { f_table }, -+#undef r -+#define r(p,q,r,s) w2(q) -+ { f_table }, -+#undef r -+#define r(p,q,r,s) w3(q) -+ { f_table } -+}; -+#endif -+ -+#undef w -+#define w w0 -+#if defined(ONE_LR_TABLE) -+static const u_int32_t il_tab[256] = -+ { 
li_table }; -+#elif defined(FOUR_LR_TABLES) -+static const u_int32_t il_tab[4][256] = -+{ { li_table }, -+#undef w -+#define w w1 -+ { li_table }, -+#undef w -+#define w w2 -+ { li_table }, -+#undef w -+#define w w3 -+ { li_table } -+}; -+#endif -+ -+#endif -+ -+#if defined(FIXED_TABLES) && (defined(ONE_IM_TABLE) || defined(FOUR_IM_TABLES)) -+ -+#define m_table \ -+ r(00,00,00,00), r(0b,0d,09,0e), r(16,1a,12,1c), r(1d,17,1b,12),\ -+ r(2c,34,24,38), r(27,39,2d,36), r(3a,2e,36,24), r(31,23,3f,2a),\ -+ r(58,68,48,70), r(53,65,41,7e), r(4e,72,5a,6c), r(45,7f,53,62),\ -+ r(74,5c,6c,48), r(7f,51,65,46), r(62,46,7e,54), r(69,4b,77,5a),\ -+ r(b0,d0,90,e0), r(bb,dd,99,ee), r(a6,ca,82,fc), r(ad,c7,8b,f2),\ -+ r(9c,e4,b4,d8), r(97,e9,bd,d6), r(8a,fe,a6,c4), r(81,f3,af,ca),\ -+ r(e8,b8,d8,90), r(e3,b5,d1,9e), r(fe,a2,ca,8c), r(f5,af,c3,82),\ -+ r(c4,8c,fc,a8), r(cf,81,f5,a6), r(d2,96,ee,b4), r(d9,9b,e7,ba),\ -+ r(7b,bb,3b,db), r(70,b6,32,d5), r(6d,a1,29,c7), r(66,ac,20,c9),\ -+ r(57,8f,1f,e3), r(5c,82,16,ed), r(41,95,0d,ff), r(4a,98,04,f1),\ -+ r(23,d3,73,ab), r(28,de,7a,a5), r(35,c9,61,b7), r(3e,c4,68,b9),\ -+ r(0f,e7,57,93), r(04,ea,5e,9d), r(19,fd,45,8f), r(12,f0,4c,81),\ -+ r(cb,6b,ab,3b), r(c0,66,a2,35), r(dd,71,b9,27), r(d6,7c,b0,29),\ -+ r(e7,5f,8f,03), r(ec,52,86,0d), r(f1,45,9d,1f), r(fa,48,94,11),\ -+ r(93,03,e3,4b), r(98,0e,ea,45), r(85,19,f1,57), r(8e,14,f8,59),\ -+ r(bf,37,c7,73), r(b4,3a,ce,7d), r(a9,2d,d5,6f), r(a2,20,dc,61),\ -+ r(f6,6d,76,ad), r(fd,60,7f,a3), r(e0,77,64,b1), r(eb,7a,6d,bf),\ -+ r(da,59,52,95), r(d1,54,5b,9b), r(cc,43,40,89), r(c7,4e,49,87),\ -+ r(ae,05,3e,dd), r(a5,08,37,d3), r(b8,1f,2c,c1), r(b3,12,25,cf),\ -+ r(82,31,1a,e5), r(89,3c,13,eb), r(94,2b,08,f9), r(9f,26,01,f7),\ -+ r(46,bd,e6,4d), r(4d,b0,ef,43), r(50,a7,f4,51), r(5b,aa,fd,5f),\ -+ r(6a,89,c2,75), r(61,84,cb,7b), r(7c,93,d0,69), r(77,9e,d9,67),\ -+ r(1e,d5,ae,3d), r(15,d8,a7,33), r(08,cf,bc,21), r(03,c2,b5,2f),\ -+ r(32,e1,8a,05), r(39,ec,83,0b), r(24,fb,98,19), r(2f,f6,91,17),\ -+ r(8d,d6,4d,76), r(86,db,44,78), r(9b,cc,5f,6a), r(90,c1,56,64),\ -+ r(a1,e2,69,4e), r(aa,ef,60,40), r(b7,f8,7b,52), r(bc,f5,72,5c),\ -+ r(d5,be,05,06), r(de,b3,0c,08), r(c3,a4,17,1a), r(c8,a9,1e,14),\ -+ r(f9,8a,21,3e), r(f2,87,28,30), r(ef,90,33,22), r(e4,9d,3a,2c),\ -+ r(3d,06,dd,96), r(36,0b,d4,98), r(2b,1c,cf,8a), r(20,11,c6,84),\ -+ r(11,32,f9,ae), r(1a,3f,f0,a0), r(07,28,eb,b2), r(0c,25,e2,bc),\ -+ r(65,6e,95,e6), r(6e,63,9c,e8), r(73,74,87,fa), r(78,79,8e,f4),\ -+ r(49,5a,b1,de), r(42,57,b8,d0), r(5f,40,a3,c2), r(54,4d,aa,cc),\ -+ r(f7,da,ec,41), r(fc,d7,e5,4f), r(e1,c0,fe,5d), r(ea,cd,f7,53),\ -+ r(db,ee,c8,79), r(d0,e3,c1,77), r(cd,f4,da,65), r(c6,f9,d3,6b),\ -+ r(af,b2,a4,31), r(a4,bf,ad,3f), r(b9,a8,b6,2d), r(b2,a5,bf,23),\ -+ r(83,86,80,09), r(88,8b,89,07), r(95,9c,92,15), r(9e,91,9b,1b),\ -+ r(47,0a,7c,a1), r(4c,07,75,af), r(51,10,6e,bd), r(5a,1d,67,b3),\ -+ r(6b,3e,58,99), r(60,33,51,97), r(7d,24,4a,85), r(76,29,43,8b),\ -+ r(1f,62,34,d1), r(14,6f,3d,df), r(09,78,26,cd), r(02,75,2f,c3),\ -+ r(33,56,10,e9), r(38,5b,19,e7), r(25,4c,02,f5), r(2e,41,0b,fb),\ -+ r(8c,61,d7,9a), r(87,6c,de,94), r(9a,7b,c5,86), r(91,76,cc,88),\ -+ r(a0,55,f3,a2), r(ab,58,fa,ac), r(b6,4f,e1,be), r(bd,42,e8,b0),\ -+ r(d4,09,9f,ea), r(df,04,96,e4), r(c2,13,8d,f6), r(c9,1e,84,f8),\ -+ r(f8,3d,bb,d2), r(f3,30,b2,dc), r(ee,27,a9,ce), r(e5,2a,a0,c0),\ -+ r(3c,b1,47,7a), r(37,bc,4e,74), r(2a,ab,55,66), r(21,a6,5c,68),\ -+ r(10,85,63,42), r(1b,88,6a,4c), r(06,9f,71,5e), r(0d,92,78,50),\ -+ r(64,d9,0f,0a), r(6f,d4,06,04), r(72,c3,1d,16), 
r(79,ce,14,18),\ -+ r(48,ed,2b,32), r(43,e0,22,3c), r(5e,f7,39,2e), r(55,fa,30,20),\ -+ r(01,b7,9a,ec), r(0a,ba,93,e2), r(17,ad,88,f0), r(1c,a0,81,fe),\ -+ r(2d,83,be,d4), r(26,8e,b7,da), r(3b,99,ac,c8), r(30,94,a5,c6),\ -+ r(59,df,d2,9c), r(52,d2,db,92), r(4f,c5,c0,80), r(44,c8,c9,8e),\ -+ r(75,eb,f6,a4), r(7e,e6,ff,aa), r(63,f1,e4,b8), r(68,fc,ed,b6),\ -+ r(b1,67,0a,0c), r(ba,6a,03,02), r(a7,7d,18,10), r(ac,70,11,1e),\ -+ r(9d,53,2e,34), r(96,5e,27,3a), r(8b,49,3c,28), r(80,44,35,26),\ -+ r(e9,0f,42,7c), r(e2,02,4b,72), r(ff,15,50,60), r(f4,18,59,6e),\ -+ r(c5,3b,66,44), r(ce,36,6f,4a), r(d3,21,74,58), r(d8,2c,7d,56),\ -+ r(7a,0c,a1,37), r(71,01,a8,39), r(6c,16,b3,2b), r(67,1b,ba,25),\ -+ r(56,38,85,0f), r(5d,35,8c,01), r(40,22,97,13), r(4b,2f,9e,1d),\ -+ r(22,64,e9,47), r(29,69,e0,49), r(34,7e,fb,5b), r(3f,73,f2,55),\ -+ r(0e,50,cd,7f), r(05,5d,c4,71), r(18,4a,df,63), r(13,47,d6,6d),\ -+ r(ca,dc,31,d7), r(c1,d1,38,d9), r(dc,c6,23,cb), r(d7,cb,2a,c5),\ -+ r(e6,e8,15,ef), r(ed,e5,1c,e1), r(f0,f2,07,f3), r(fb,ff,0e,fd),\ -+ r(92,b4,79,a7), r(99,b9,70,a9), r(84,ae,6b,bb), r(8f,a3,62,b5),\ -+ r(be,80,5d,9f), r(b5,8d,54,91), r(a8,9a,4f,83), r(a3,97,46,8d) -+ -+#undef r -+#define r r0 -+ -+#if defined(ONE_IM_TABLE) -+static const u_int32_t im_tab[256] = -+ { m_table }; -+#elif defined(FOUR_IM_TABLES) -+static const u_int32_t im_tab[4][256] = -+{ { m_table }, -+#undef r -+#define r r1 -+ { m_table }, -+#undef r -+#define r r2 -+ { m_table }, -+#undef r -+#define r r3 -+ { m_table } -+}; -+#endif -+ -+#endif -+ -+#else -+ -+static int tab_gen = 0; -+ -+static unsigned char s_box[256]; // the S box -+static unsigned char inv_s_box[256]; // the inverse S box -+static u_int32_t rcon_tab[AES_RC_LENGTH]; // table of round constants -+ -+#if defined(ONE_TABLE) -+static u_int32_t ft_tab[256]; -+static u_int32_t it_tab[256]; -+#elif defined(FOUR_TABLES) -+static u_int32_t ft_tab[4][256]; -+static u_int32_t it_tab[4][256]; -+#endif -+ -+#if defined(ONE_LR_TABLE) -+static u_int32_t fl_tab[256]; -+static u_int32_t il_tab[256]; -+#elif defined(FOUR_LR_TABLES) -+static u_int32_t fl_tab[4][256]; -+static u_int32_t il_tab[4][256]; -+#endif -+ -+#if defined(ONE_IM_TABLE) -+static u_int32_t im_tab[256]; -+#elif defined(FOUR_IM_TABLES) -+static u_int32_t im_tab[4][256]; -+#endif -+ -+// Generate the tables for the dynamic table option -+ -+#if !defined(FF_TABLES) -+ -+// It will generally be sensible to use tables to compute finite -+// field multiplies and inverses but where memory is scarse this -+// code might sometimes be better. -+ -+// return 2 ^ (n - 1) where n is the bit number of the highest bit -+// set in x with x in the range 1 < x < 0x00000200. This form is -+// used so that locals within FFinv can be bytes rather than words -+ -+static unsigned char hibit(const u_int32_t x) -+{ unsigned char r = (unsigned char)((x >> 1) | (x >> 2)); -+ -+ r |= (r >> 2); -+ r |= (r >> 4); -+ return (r + 1) >> 1; -+} -+ -+// return the inverse of the finite field element x -+ -+static unsigned char FFinv(const unsigned char x) -+{ unsigned char p1 = x, p2 = 0x1b, n1 = hibit(x), n2 = 0x80, v1 = 1, v2 = 0; -+ -+ if(x < 2) return x; -+ -+ for(;;) -+ { -+ if(!n1) return v1; -+ -+ while(n2 >= n1) -+ { -+ n2 /= n1; p2 ^= p1 * n2; v2 ^= v1 * n2; n2 = hibit(p2); -+ } -+ -+ if(!n2) return v2; -+ -+ while(n1 >= n2) -+ { -+ n1 /= n2; p1 ^= p2 * n1; v1 ^= v2 * n1; n1 = hibit(p1); -+ } -+ } -+} -+ -+// define the finite field multiplies required for Rijndael -+ -+#define FFmul02(x) ((((x) & 0x7f) << 1) ^ ((x) & 0x80 ? 
0x1b : 0)) -+#define FFmul03(x) ((x) ^ FFmul02(x)) -+#define FFmul09(x) ((x) ^ FFmul02(FFmul02(FFmul02(x)))) -+#define FFmul0b(x) ((x) ^ FFmul02((x) ^ FFmul02(FFmul02(x)))) -+#define FFmul0d(x) ((x) ^ FFmul02(FFmul02((x) ^ FFmul02(x)))) -+#define FFmul0e(x) FFmul02((x) ^ FFmul02((x) ^ FFmul02(x))) -+ -+#else -+ -+#define FFinv(x) ((x) ? pow[255 - log[x]]: 0) -+ -+#define FFmul02(x) (x ? pow[log[x] + 0x19] : 0) -+#define FFmul03(x) (x ? pow[log[x] + 0x01] : 0) -+#define FFmul09(x) (x ? pow[log[x] + 0xc7] : 0) -+#define FFmul0b(x) (x ? pow[log[x] + 0x68] : 0) -+#define FFmul0d(x) (x ? pow[log[x] + 0xee] : 0) -+#define FFmul0e(x) (x ? pow[log[x] + 0xdf] : 0) -+ -+#endif -+ -+// The forward and inverse affine transformations used in the S-box -+ -+#define fwd_affine(x) \ -+ (w = (u_int32_t)x, w ^= (w<<1)^(w<<2)^(w<<3)^(w<<4), 0x63^(unsigned char)(w^(w>>8))) -+ -+#define inv_affine(x) \ -+ (w = (u_int32_t)x, w = (w<<1)^(w<<3)^(w<<6), 0x05^(unsigned char)(w^(w>>8))) -+ -+static void gen_tabs(void) -+{ u_int32_t i, w; -+ -+#if defined(FF_TABLES) -+ -+ unsigned char pow[512], log[256]; -+ -+ // log and power tables for GF(2^8) finite field with -+ // 0x011b as modular polynomial - the simplest primitive -+ // root is 0x03, used here to generate the tables -+ -+ i = 0; w = 1; -+ do -+ { -+ pow[i] = (unsigned char)w; -+ pow[i + 255] = (unsigned char)w; -+ log[w] = (unsigned char)i++; -+ w ^= (w << 1) ^ (w & ff_hi ? ff_poly : 0); -+ } -+ while (w != 1); -+ -+#endif -+ -+ for(i = 0, w = 1; i < AES_RC_LENGTH; ++i) -+ { -+ rcon_tab[i] = bytes2word(w, 0, 0, 0); -+ w = (w << 1) ^ (w & ff_hi ? ff_poly : 0); -+ } -+ -+ for(i = 0; i < 256; ++i) -+ { unsigned char b; -+ -+ s_box[i] = b = fwd_affine(FFinv((unsigned char)i)); -+ -+ w = bytes2word(b, 0, 0, 0); -+#if defined(ONE_LR_TABLE) -+ fl_tab[i] = w; -+#elif defined(FOUR_LR_TABLES) -+ fl_tab[0][i] = w; -+ fl_tab[1][i] = upr(w,1); -+ fl_tab[2][i] = upr(w,2); -+ fl_tab[3][i] = upr(w,3); -+#endif -+ w = bytes2word(FFmul02(b), b, b, FFmul03(b)); -+#if defined(ONE_TABLE) -+ ft_tab[i] = w; -+#elif defined(FOUR_TABLES) -+ ft_tab[0][i] = w; -+ ft_tab[1][i] = upr(w,1); -+ ft_tab[2][i] = upr(w,2); -+ ft_tab[3][i] = upr(w,3); -+#endif -+ inv_s_box[i] = b = FFinv(inv_affine((unsigned char)i)); -+ -+ w = bytes2word(b, 0, 0, 0); -+#if defined(ONE_LR_TABLE) -+ il_tab[i] = w; -+#elif defined(FOUR_LR_TABLES) -+ il_tab[0][i] = w; -+ il_tab[1][i] = upr(w,1); -+ il_tab[2][i] = upr(w,2); -+ il_tab[3][i] = upr(w,3); -+#endif -+ w = bytes2word(FFmul0e(b), FFmul09(b), FFmul0d(b), FFmul0b(b)); -+#if defined(ONE_TABLE) -+ it_tab[i] = w; -+#elif defined(FOUR_TABLES) -+ it_tab[0][i] = w; -+ it_tab[1][i] = upr(w,1); -+ it_tab[2][i] = upr(w,2); -+ it_tab[3][i] = upr(w,3); -+#endif -+#if defined(ONE_IM_TABLE) -+ im_tab[b] = w; -+#elif defined(FOUR_IM_TABLES) -+ im_tab[0][b] = w; -+ im_tab[1][b] = upr(w,1); -+ im_tab[2][b] = upr(w,2); -+ im_tab[3][b] = upr(w,3); -+#endif -+ -+ } -+} -+ -+#endif -+ -+#define no_table(x,box,vf,rf,c) bytes2word( \ -+ box[bval(vf(x,0,c),rf(0,c))], \ -+ box[bval(vf(x,1,c),rf(1,c))], \ -+ box[bval(vf(x,2,c),rf(2,c))], \ -+ box[bval(vf(x,3,c),rf(3,c))]) -+ -+#define one_table(x,op,tab,vf,rf,c) \ -+ ( tab[bval(vf(x,0,c),rf(0,c))] \ -+ ^ op(tab[bval(vf(x,1,c),rf(1,c))],1) \ -+ ^ op(tab[bval(vf(x,2,c),rf(2,c))],2) \ -+ ^ op(tab[bval(vf(x,3,c),rf(3,c))],3)) -+ -+#define four_tables(x,tab,vf,rf,c) \ -+ ( tab[0][bval(vf(x,0,c),rf(0,c))] \ -+ ^ tab[1][bval(vf(x,1,c),rf(1,c))] \ -+ ^ tab[2][bval(vf(x,2,c),rf(2,c))] \ -+ ^ tab[3][bval(vf(x,3,c),rf(3,c))]) -+ -+#define 
vf1(x,r,c) (x) -+#define rf1(r,c) (r) -+#define rf2(r,c) ((r-c)&3) -+ -+#if defined(FOUR_LR_TABLES) -+#define ls_box(x,c) four_tables(x,fl_tab,vf1,rf2,c) -+#elif defined(ONE_LR_TABLE) -+#define ls_box(x,c) one_table(x,upr,fl_tab,vf1,rf2,c) -+#else -+#define ls_box(x,c) no_table(x,s_box,vf1,rf2,c) -+#endif -+ -+#if defined(FOUR_IM_TABLES) -+#define inv_mcol(x) four_tables(x,im_tab,vf1,rf1,0) -+#elif defined(ONE_IM_TABLE) -+#define inv_mcol(x) one_table(x,upr,im_tab,vf1,rf1,0) -+#else -+#define inv_mcol(x) \ -+ (f9 = (x),f2 = FFmulX(f9), f4 = FFmulX(f2), f8 = FFmulX(f4), f9 ^= f8, \ -+ f2 ^= f4 ^ f8 ^ upr(f2 ^ f9,3) ^ upr(f4 ^ f9,2) ^ upr(f9,1)) -+#endif -+ -+// Subroutine to set the block size (if variable) in bytes, legal -+// values being 16, 24 and 32. -+ -+#if defined(AES_BLOCK_SIZE) -+#define nc (AES_BLOCK_SIZE / 4) -+#else -+#define nc (cx->aes_Ncol) -+ -+void aes_set_blk(aes_context *cx, int n_bytes) -+{ -+#if !defined(FIXED_TABLES) -+ if(!tab_gen) { gen_tabs(); tab_gen = 1; } -+#endif -+ -+ switch(n_bytes) { -+ case 32: /* bytes */ -+ case 256: /* bits */ -+ nc = 8; -+ break; -+ case 24: /* bytes */ -+ case 192: /* bits */ -+ nc = 6; -+ break; -+ case 16: /* bytes */ -+ case 128: /* bits */ -+ default: -+ nc = 4; -+ break; -+ } -+} -+ -+#endif -+ -+// Initialise the key schedule from the user supplied key. The key -+// length is now specified in bytes - 16, 24 or 32 as appropriate. -+// This corresponds to bit lengths of 128, 192 and 256 bits, and -+// to Nk values of 4, 6 and 8 respectively. -+ -+#define mx(t,f) (*t++ = inv_mcol(*f),f++) -+#define cp(t,f) *t++ = *f++ -+ -+#if AES_BLOCK_SIZE == 16 -+#define cpy(d,s) cp(d,s); cp(d,s); cp(d,s); cp(d,s) -+#define mix(d,s) mx(d,s); mx(d,s); mx(d,s); mx(d,s) -+#elif AES_BLOCK_SIZE == 24 -+#define cpy(d,s) cp(d,s); cp(d,s); cp(d,s); cp(d,s); \ -+ cp(d,s); cp(d,s) -+#define mix(d,s) mx(d,s); mx(d,s); mx(d,s); mx(d,s); \ -+ mx(d,s); mx(d,s) -+#elif AES_BLOCK_SIZE == 32 -+#define cpy(d,s) cp(d,s); cp(d,s); cp(d,s); cp(d,s); \ -+ cp(d,s); cp(d,s); cp(d,s); cp(d,s) -+#define mix(d,s) mx(d,s); mx(d,s); mx(d,s); mx(d,s); \ -+ mx(d,s); mx(d,s); mx(d,s); mx(d,s) -+#else -+ -+#define cpy(d,s) \ -+switch(nc) \ -+{ case 8: cp(d,s); cp(d,s); \ -+ case 6: cp(d,s); cp(d,s); \ -+ case 4: cp(d,s); cp(d,s); \ -+ cp(d,s); cp(d,s); \ -+} -+ -+#define mix(d,s) \ -+switch(nc) \ -+{ case 8: mx(d,s); mx(d,s); \ -+ case 6: mx(d,s); mx(d,s); \ -+ case 4: mx(d,s); mx(d,s); \ -+ mx(d,s); mx(d,s); \ -+} -+ -+#endif -+ -+void aes_set_key(aes_context *cx, const unsigned char in_key[], int n_bytes, const int f) -+{ u_int32_t *kf, *kt, rci; -+ -+#if !defined(FIXED_TABLES) -+ if(!tab_gen) { gen_tabs(); tab_gen = 1; } -+#endif -+ -+ switch(n_bytes) { -+ case 32: /* bytes */ -+ case 256: /* bits */ -+ cx->aes_Nkey = 8; -+ break; -+ case 24: /* bytes */ -+ case 192: /* bits */ -+ cx->aes_Nkey = 6; -+ break; -+ case 16: /* bytes */ -+ case 128: /* bits */ -+ default: -+ cx->aes_Nkey = 4; -+ break; -+ } -+ -+ cx->aes_Nrnd = (cx->aes_Nkey > nc ? 
cx->aes_Nkey : nc) + 6; -+ -+ cx->aes_e_key[0] = word_in(in_key ); -+ cx->aes_e_key[1] = word_in(in_key + 4); -+ cx->aes_e_key[2] = word_in(in_key + 8); -+ cx->aes_e_key[3] = word_in(in_key + 12); -+ -+ kf = cx->aes_e_key; -+ kt = kf + nc * (cx->aes_Nrnd + 1) - cx->aes_Nkey; -+ rci = 0; -+ -+ switch(cx->aes_Nkey) -+ { -+ case 4: do -+ { kf[4] = kf[0] ^ ls_box(kf[3],3) ^ rcon_tab[rci++]; -+ kf[5] = kf[1] ^ kf[4]; -+ kf[6] = kf[2] ^ kf[5]; -+ kf[7] = kf[3] ^ kf[6]; -+ kf += 4; -+ } -+ while(kf < kt); -+ break; -+ -+ case 6: cx->aes_e_key[4] = word_in(in_key + 16); -+ cx->aes_e_key[5] = word_in(in_key + 20); -+ do -+ { kf[ 6] = kf[0] ^ ls_box(kf[5],3) ^ rcon_tab[rci++]; -+ kf[ 7] = kf[1] ^ kf[ 6]; -+ kf[ 8] = kf[2] ^ kf[ 7]; -+ kf[ 9] = kf[3] ^ kf[ 8]; -+ kf[10] = kf[4] ^ kf[ 9]; -+ kf[11] = kf[5] ^ kf[10]; -+ kf += 6; -+ } -+ while(kf < kt); -+ break; -+ -+ case 8: cx->aes_e_key[4] = word_in(in_key + 16); -+ cx->aes_e_key[5] = word_in(in_key + 20); -+ cx->aes_e_key[6] = word_in(in_key + 24); -+ cx->aes_e_key[7] = word_in(in_key + 28); -+ do -+ { kf[ 8] = kf[0] ^ ls_box(kf[7],3) ^ rcon_tab[rci++]; -+ kf[ 9] = kf[1] ^ kf[ 8]; -+ kf[10] = kf[2] ^ kf[ 9]; -+ kf[11] = kf[3] ^ kf[10]; -+ kf[12] = kf[4] ^ ls_box(kf[11],0); -+ kf[13] = kf[5] ^ kf[12]; -+ kf[14] = kf[6] ^ kf[13]; -+ kf[15] = kf[7] ^ kf[14]; -+ kf += 8; -+ } -+ while (kf < kt); -+ break; -+ } -+ -+ if(!f) -+ { u_int32_t i; -+ -+ kt = cx->aes_d_key + nc * cx->aes_Nrnd; -+ kf = cx->aes_e_key; -+ -+ cpy(kt, kf); kt -= 2 * nc; -+ -+ for(i = 1; i < cx->aes_Nrnd; ++i) -+ { -+#if defined(ONE_TABLE) || defined(FOUR_TABLES) -+#if !defined(ONE_IM_TABLE) && !defined(FOUR_IM_TABLES) -+ u_int32_t f2, f4, f8, f9; -+#endif -+ mix(kt, kf); -+#else -+ cpy(kt, kf); -+#endif -+ kt -= 2 * nc; -+ } -+ -+ cpy(kt, kf); -+ } -+} -+ -+// y = output word, x = input word, r = row, c = column -+// for r = 0, 1, 2 and 3 = column accessed for row r -+ -+#if defined(ARRAYS) -+#define s(x,c) x[c] -+#else -+#define s(x,c) x##c -+#endif -+ -+// I am grateful to Frank Yellin for the following constructions -+// which, given the column (c) of the output state variable that -+// is being computed, return the input state variables which are -+// needed for each row (r) of the state -+ -+// For the fixed block size options, compilers reduce these two -+// expressions to fixed variable references. For variable block -+// size code conditional clauses will sometimes be returned -+ -+#define unused 77 // Sunset Strip -+ -+#define fwd_var(x,r,c) \ -+ ( r==0 ? \ -+ ( c==0 ? s(x,0) \ -+ : c==1 ? s(x,1) \ -+ : c==2 ? s(x,2) \ -+ : c==3 ? s(x,3) \ -+ : c==4 ? s(x,4) \ -+ : c==5 ? s(x,5) \ -+ : c==6 ? s(x,6) \ -+ : s(x,7)) \ -+ : r==1 ? \ -+ ( c==0 ? s(x,1) \ -+ : c==1 ? s(x,2) \ -+ : c==2 ? s(x,3) \ -+ : c==3 ? nc==4 ? s(x,0) : s(x,4) \ -+ : c==4 ? s(x,5) \ -+ : c==5 ? nc==8 ? s(x,6) : s(x,0) \ -+ : c==6 ? s(x,7) \ -+ : s(x,0)) \ -+ : r==2 ? \ -+ ( c==0 ? nc==8 ? s(x,3) : s(x,2) \ -+ : c==1 ? nc==8 ? s(x,4) : s(x,3) \ -+ : c==2 ? nc==4 ? s(x,0) : nc==8 ? s(x,5) : s(x,4) \ -+ : c==3 ? nc==4 ? s(x,1) : nc==8 ? s(x,6) : s(x,5) \ -+ : c==4 ? nc==8 ? s(x,7) : s(x,0) \ -+ : c==5 ? nc==8 ? s(x,0) : s(x,1) \ -+ : c==6 ? s(x,1) \ -+ : s(x,2)) \ -+ : \ -+ ( c==0 ? nc==8 ? s(x,4) : s(x,3) \ -+ : c==1 ? nc==4 ? s(x,0) : nc==8 ? s(x,5) : s(x,4) \ -+ : c==2 ? nc==4 ? s(x,1) : nc==8 ? s(x,6) : s(x,5) \ -+ : c==3 ? nc==4 ? s(x,2) : nc==8 ? s(x,7) : s(x,0) \ -+ : c==4 ? nc==8 ? s(x,0) : s(x,1) \ -+ : c==5 ? nc==8 ? s(x,1) : s(x,2) \ -+ : c==6 ? 
s(x,2) \ -+ : s(x,3))) -+ -+#define inv_var(x,r,c) \ -+ ( r==0 ? \ -+ ( c==0 ? s(x,0) \ -+ : c==1 ? s(x,1) \ -+ : c==2 ? s(x,2) \ -+ : c==3 ? s(x,3) \ -+ : c==4 ? s(x,4) \ -+ : c==5 ? s(x,5) \ -+ : c==6 ? s(x,6) \ -+ : s(x,7)) \ -+ : r==1 ? \ -+ ( c==0 ? nc==4 ? s(x,3) : nc==8 ? s(x,7) : s(x,5) \ -+ : c==1 ? s(x,0) \ -+ : c==2 ? s(x,1) \ -+ : c==3 ? s(x,2) \ -+ : c==4 ? s(x,3) \ -+ : c==5 ? s(x,4) \ -+ : c==6 ? s(x,5) \ -+ : s(x,6)) \ -+ : r==2 ? \ -+ ( c==0 ? nc==4 ? s(x,2) : nc==8 ? s(x,5) : s(x,4) \ -+ : c==1 ? nc==4 ? s(x,3) : nc==8 ? s(x,6) : s(x,5) \ -+ : c==2 ? nc==8 ? s(x,7) : s(x,0) \ -+ : c==3 ? nc==8 ? s(x,0) : s(x,1) \ -+ : c==4 ? nc==8 ? s(x,1) : s(x,2) \ -+ : c==5 ? nc==8 ? s(x,2) : s(x,3) \ -+ : c==6 ? s(x,3) \ -+ : s(x,4)) \ -+ : \ -+ ( c==0 ? nc==4 ? s(x,1) : nc==8 ? s(x,4) : s(x,3) \ -+ : c==1 ? nc==4 ? s(x,2) : nc==8 ? s(x,5) : s(x,4) \ -+ : c==2 ? nc==4 ? s(x,3) : nc==8 ? s(x,6) : s(x,5) \ -+ : c==3 ? nc==8 ? s(x,7) : s(x,0) \ -+ : c==4 ? nc==8 ? s(x,0) : s(x,1) \ -+ : c==5 ? nc==8 ? s(x,1) : s(x,2) \ -+ : c==6 ? s(x,2) \ -+ : s(x,3))) -+ -+#define si(y,x,k,c) s(y,c) = word_in(x + 4 * c) ^ k[c] -+#define so(y,x,c) word_out(y + 4 * c, s(x,c)) -+ -+#if defined(FOUR_TABLES) -+#define fwd_rnd(y,x,k,c) s(y,c)= (k)[c] ^ four_tables(x,ft_tab,fwd_var,rf1,c) -+#define inv_rnd(y,x,k,c) s(y,c)= (k)[c] ^ four_tables(x,it_tab,inv_var,rf1,c) -+#elif defined(ONE_TABLE) -+#define fwd_rnd(y,x,k,c) s(y,c)= (k)[c] ^ one_table(x,upr,ft_tab,fwd_var,rf1,c) -+#define inv_rnd(y,x,k,c) s(y,c)= (k)[c] ^ one_table(x,upr,it_tab,inv_var,rf1,c) -+#else -+#define fwd_rnd(y,x,k,c) s(y,c) = fwd_mcol(no_table(x,s_box,fwd_var,rf1,c)) ^ (k)[c] -+#define inv_rnd(y,x,k,c) s(y,c) = inv_mcol(no_table(x,inv_s_box,inv_var,rf1,c) ^ (k)[c]) -+#endif -+ -+#if defined(FOUR_LR_TABLES) -+#define fwd_lrnd(y,x,k,c) s(y,c)= (k)[c] ^ four_tables(x,fl_tab,fwd_var,rf1,c) -+#define inv_lrnd(y,x,k,c) s(y,c)= (k)[c] ^ four_tables(x,il_tab,inv_var,rf1,c) -+#elif defined(ONE_LR_TABLE) -+#define fwd_lrnd(y,x,k,c) s(y,c)= (k)[c] ^ one_table(x,ups,fl_tab,fwd_var,rf1,c) -+#define inv_lrnd(y,x,k,c) s(y,c)= (k)[c] ^ one_table(x,ups,il_tab,inv_var,rf1,c) -+#else -+#define fwd_lrnd(y,x,k,c) s(y,c) = no_table(x,s_box,fwd_var,rf1,c) ^ (k)[c] -+#define inv_lrnd(y,x,k,c) s(y,c) = no_table(x,inv_s_box,inv_var,rf1,c) ^ (k)[c] -+#endif -+ -+#if AES_BLOCK_SIZE == 16 -+ -+#if defined(ARRAYS) -+#define locals(y,x) x[4],y[4] -+#else -+#define locals(y,x) x##0,x##1,x##2,x##3,y##0,y##1,y##2,y##3 -+// the following defines prevent the compiler requiring the declaration -+// of generated but unused variables in the fwd_var and inv_var macros -+#define b04 unused -+#define b05 unused -+#define b06 unused -+#define b07 unused -+#define b14 unused -+#define b15 unused -+#define b16 unused -+#define b17 unused -+#endif -+#define l_copy(y, x) s(y,0) = s(x,0); s(y,1) = s(x,1); \ -+ s(y,2) = s(x,2); s(y,3) = s(x,3); -+#define state_in(y,x,k) si(y,x,k,0); si(y,x,k,1); si(y,x,k,2); si(y,x,k,3) -+#define state_out(y,x) so(y,x,0); so(y,x,1); so(y,x,2); so(y,x,3) -+#define round(rm,y,x,k) rm(y,x,k,0); rm(y,x,k,1); rm(y,x,k,2); rm(y,x,k,3) -+ -+#elif AES_BLOCK_SIZE == 24 -+ -+#if defined(ARRAYS) -+#define locals(y,x) x[6],y[6] -+#else -+#define locals(y,x) x##0,x##1,x##2,x##3,x##4,x##5, \ -+ y##0,y##1,y##2,y##3,y##4,y##5 -+#define b06 unused -+#define b07 unused -+#define b16 unused -+#define b17 unused -+#endif -+#define l_copy(y, x) s(y,0) = s(x,0); s(y,1) = s(x,1); \ -+ s(y,2) = s(x,2); s(y,3) = s(x,3); \ -+ s(y,4) = s(x,4); s(y,5) = s(x,5); -+#define 
state_in(y,x,k) si(y,x,k,0); si(y,x,k,1); si(y,x,k,2); \ -+ si(y,x,k,3); si(y,x,k,4); si(y,x,k,5) -+#define state_out(y,x) so(y,x,0); so(y,x,1); so(y,x,2); \ -+ so(y,x,3); so(y,x,4); so(y,x,5) -+#define round(rm,y,x,k) rm(y,x,k,0); rm(y,x,k,1); rm(y,x,k,2); \ -+ rm(y,x,k,3); rm(y,x,k,4); rm(y,x,k,5) -+#else -+ -+#if defined(ARRAYS) -+#define locals(y,x) x[8],y[8] -+#else -+#define locals(y,x) x##0,x##1,x##2,x##3,x##4,x##5,x##6,x##7, \ -+ y##0,y##1,y##2,y##3,y##4,y##5,y##6,y##7 -+#endif -+#define l_copy(y, x) s(y,0) = s(x,0); s(y,1) = s(x,1); \ -+ s(y,2) = s(x,2); s(y,3) = s(x,3); \ -+ s(y,4) = s(x,4); s(y,5) = s(x,5); \ -+ s(y,6) = s(x,6); s(y,7) = s(x,7); -+ -+#if AES_BLOCK_SIZE == 32 -+ -+#define state_in(y,x,k) si(y,x,k,0); si(y,x,k,1); si(y,x,k,2); si(y,x,k,3); \ -+ si(y,x,k,4); si(y,x,k,5); si(y,x,k,6); si(y,x,k,7) -+#define state_out(y,x) so(y,x,0); so(y,x,1); so(y,x,2); so(y,x,3); \ -+ so(y,x,4); so(y,x,5); so(y,x,6); so(y,x,7) -+#define round(rm,y,x,k) rm(y,x,k,0); rm(y,x,k,1); rm(y,x,k,2); rm(y,x,k,3); \ -+ rm(y,x,k,4); rm(y,x,k,5); rm(y,x,k,6); rm(y,x,k,7) -+#else -+ -+#define state_in(y,x,k) \ -+switch(nc) \ -+{ case 8: si(y,x,k,7); si(y,x,k,6); \ -+ case 6: si(y,x,k,5); si(y,x,k,4); \ -+ case 4: si(y,x,k,3); si(y,x,k,2); \ -+ si(y,x,k,1); si(y,x,k,0); \ -+} -+ -+#define state_out(y,x) \ -+switch(nc) \ -+{ case 8: so(y,x,7); so(y,x,6); \ -+ case 6: so(y,x,5); so(y,x,4); \ -+ case 4: so(y,x,3); so(y,x,2); \ -+ so(y,x,1); so(y,x,0); \ -+} -+ -+#if defined(FAST_VARIABLE) -+ -+#define round(rm,y,x,k) \ -+switch(nc) \ -+{ case 8: rm(y,x,k,7); rm(y,x,k,6); \ -+ rm(y,x,k,5); rm(y,x,k,4); \ -+ rm(y,x,k,3); rm(y,x,k,2); \ -+ rm(y,x,k,1); rm(y,x,k,0); \ -+ break; \ -+ case 6: rm(y,x,k,5); rm(y,x,k,4); \ -+ rm(y,x,k,3); rm(y,x,k,2); \ -+ rm(y,x,k,1); rm(y,x,k,0); \ -+ break; \ -+ case 4: rm(y,x,k,3); rm(y,x,k,2); \ -+ rm(y,x,k,1); rm(y,x,k,0); \ -+ break; \ -+} -+#else -+ -+#define round(rm,y,x,k) \ -+switch(nc) \ -+{ case 8: rm(y,x,k,7); rm(y,x,k,6); \ -+ case 6: rm(y,x,k,5); rm(y,x,k,4); \ -+ case 4: rm(y,x,k,3); rm(y,x,k,2); \ -+ rm(y,x,k,1); rm(y,x,k,0); \ -+} -+ -+#endif -+ -+#endif -+#endif -+ -+void aes_encrypt(const aes_context *cx, const unsigned char in_blk[], unsigned char out_blk[]) -+{ u_int32_t locals(b0, b1); -+ const u_int32_t *kp = cx->aes_e_key; -+ -+#if !defined(ONE_TABLE) && !defined(FOUR_TABLES) -+ u_int32_t f2; -+#endif -+ -+ state_in(b0, in_blk, kp); kp += nc; -+ -+#if defined(UNROLL) -+ -+ switch(cx->aes_Nrnd) -+ { -+ case 14: round(fwd_rnd, b1, b0, kp ); -+ round(fwd_rnd, b0, b1, kp + nc ); kp += 2 * nc; -+ case 12: round(fwd_rnd, b1, b0, kp ); -+ round(fwd_rnd, b0, b1, kp + nc ); kp += 2 * nc; -+ case 10: round(fwd_rnd, b1, b0, kp ); -+ round(fwd_rnd, b0, b1, kp + nc); -+ round(fwd_rnd, b1, b0, kp + 2 * nc); -+ round(fwd_rnd, b0, b1, kp + 3 * nc); -+ round(fwd_rnd, b1, b0, kp + 4 * nc); -+ round(fwd_rnd, b0, b1, kp + 5 * nc); -+ round(fwd_rnd, b1, b0, kp + 6 * nc); -+ round(fwd_rnd, b0, b1, kp + 7 * nc); -+ round(fwd_rnd, b1, b0, kp + 8 * nc); -+ round(fwd_lrnd, b0, b1, kp + 9 * nc); -+ } -+ -+#elif defined(PARTIAL_UNROLL) -+ { u_int32_t rnd; -+ -+ for(rnd = 0; rnd < (cx->aes_Nrnd >> 1) - 1; ++rnd) -+ { -+ round(fwd_rnd, b1, b0, kp); -+ round(fwd_rnd, b0, b1, kp + nc); kp += 2 * nc; -+ } -+ -+ round(fwd_rnd, b1, b0, kp); -+ round(fwd_lrnd, b0, b1, kp + nc); -+ } -+#else -+ { u_int32_t rnd; -+ -+ for(rnd = 0; rnd < cx->aes_Nrnd - 1; ++rnd) -+ { -+ round(fwd_rnd, b1, b0, kp); -+ l_copy(b0, b1); kp += nc; -+ } -+ -+ round(fwd_lrnd, b0, b1, kp); -+ } -+#endif -+ -+ 
state_out(out_blk, b0); -+} -+ -+void aes_decrypt(const aes_context *cx, const unsigned char in_blk[], unsigned char out_blk[]) -+{ u_int32_t locals(b0, b1); -+ const u_int32_t *kp = cx->aes_d_key; -+ -+#if !defined(ONE_TABLE) && !defined(FOUR_TABLES) -+ u_int32_t f2, f4, f8, f9; -+#endif -+ -+ state_in(b0, in_blk, kp); kp += nc; -+ -+#if defined(UNROLL) -+ -+ switch(cx->aes_Nrnd) -+ { -+ case 14: round(inv_rnd, b1, b0, kp ); -+ round(inv_rnd, b0, b1, kp + nc ); kp += 2 * nc; -+ case 12: round(inv_rnd, b1, b0, kp ); -+ round(inv_rnd, b0, b1, kp + nc ); kp += 2 * nc; -+ case 10: round(inv_rnd, b1, b0, kp ); -+ round(inv_rnd, b0, b1, kp + nc); -+ round(inv_rnd, b1, b0, kp + 2 * nc); -+ round(inv_rnd, b0, b1, kp + 3 * nc); -+ round(inv_rnd, b1, b0, kp + 4 * nc); -+ round(inv_rnd, b0, b1, kp + 5 * nc); -+ round(inv_rnd, b1, b0, kp + 6 * nc); -+ round(inv_rnd, b0, b1, kp + 7 * nc); -+ round(inv_rnd, b1, b0, kp + 8 * nc); -+ round(inv_lrnd, b0, b1, kp + 9 * nc); -+ } -+ -+#elif defined(PARTIAL_UNROLL) -+ { u_int32_t rnd; -+ -+ for(rnd = 0; rnd < (cx->aes_Nrnd >> 1) - 1; ++rnd) -+ { -+ round(inv_rnd, b1, b0, kp); -+ round(inv_rnd, b0, b1, kp + nc); kp += 2 * nc; -+ } -+ -+ round(inv_rnd, b1, b0, kp); -+ round(inv_lrnd, b0, b1, kp + nc); -+ } -+#else -+ { u_int32_t rnd; -+ -+ for(rnd = 0; rnd < cx->aes_Nrnd - 1; ++rnd) -+ { -+ round(inv_rnd, b1, b0, kp); -+ l_copy(b0, b1); kp += nc; -+ } -+ -+ round(inv_lrnd, b0, b1, kp); -+ } -+#endif -+ -+ state_out(out_blk, b0); -+} -diff -urN linux-3.10-noloop/drivers/misc/aes.h linux-3.10-AES/drivers/misc/aes.h ---- linux-3.10-noloop/drivers/misc/aes.h 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/aes.h 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,112 @@ -+// I retain copyright in this code but I encourage its free use provided -+// that I don't carry any responsibility for the results. I am especially -+// happy to see it used in free and open source software. If you do use -+// it I would appreciate an acknowledgement of its origin in the code or -+// the product that results and I would also appreciate knowing a little -+// about the use to which it is being put. I am grateful to Frank Yellin -+// for some ideas that are used in this implementation. -+// -+// Dr B. R. Gladman 6th April 2001. -+// -+// This is an implementation of the AES encryption algorithm (Rijndael) -+// designed by Joan Daemen and Vincent Rijmen. This version is designed -+// to provide both fixed and dynamic block and key lengths and can also -+// run with either big or little endian internal byte order (see aes.h). -+// It inputs block and key lengths in bytes with the legal values being -+// 16, 24 and 32. -+ -+/* -+ * Modified by Jari Ruusu, May 1 2001 -+ * - Fixed some compile warnings, code was ok but gcc warned anyway. -+ * - Changed basic types: byte -> unsigned char, word -> u_int32_t -+ * - Major name space cleanup: Names visible to outside now begin -+ * with "aes_" or "AES_". A lot of stuff moved from aes.h to aes.c -+ * - Removed C++ and DLL support as part of name space cleanup. -+ * - Eliminated unnecessary recomputation of tables. (actual bug fix) -+ * - Merged precomputed constant tables to aes.c file. -+ * - Removed data alignment restrictions for portability reasons. -+ * - Made block and key lengths accept bit count (128/192/256) -+ * as well byte count (16/24/32). -+ * - Removed all error checks. This change also eliminated the need -+ * to preinitialize the context struct to zero. -+ * - Removed some totally unused constants. 
-+ */ -+ -+#ifndef _AES_H -+#define _AES_H -+ -+#include -+#include -+#include -+ -+// CONFIGURATION OPTIONS (see also aes.c) -+// -+// Define AES_BLOCK_SIZE to set the cipher block size (16, 24 or 32) or -+// leave this undefined for dynamically variable block size (this will -+// result in much slower code). -+// IMPORTANT NOTE: AES_BLOCK_SIZE is in BYTES (16, 24, 32 or undefined). If -+// left undefined a slower version providing variable block length is compiled -+ -+#define AES_BLOCK_SIZE 16 -+ -+// The number of key schedule words for different block and key lengths -+// allowing for method of computation which requires the length to be a -+// multiple of the key length -+// -+// Nk = 4 6 8 -+// ------------- -+// Nb = 4 | 60 60 64 -+// 6 | 96 90 96 -+// 8 | 120 120 120 -+ -+#if !defined(AES_BLOCK_SIZE) || (AES_BLOCK_SIZE == 32) -+#define AES_KS_LENGTH 120 -+#define AES_RC_LENGTH 29 -+#else -+#define AES_KS_LENGTH 4 * AES_BLOCK_SIZE -+#define AES_RC_LENGTH (9 * AES_BLOCK_SIZE) / 8 - 8 -+#endif -+ -+typedef struct -+{ -+ u_int32_t aes_Nkey; // the number of words in the key input block -+ u_int32_t aes_Nrnd; // the number of cipher rounds -+ u_int32_t aes_e_key[AES_KS_LENGTH]; // the encryption key schedule -+ u_int32_t aes_d_key[AES_KS_LENGTH]; // the decryption key schedule -+#if !defined(AES_BLOCK_SIZE) -+ u_int32_t aes_Ncol; // the number of columns in the cipher state -+#endif -+} aes_context; -+ -+// avoid global name conflict with mainline kernel -+#define aes_set_key _aes_set_key -+#define aes_encrypt _aes_encrypt -+#define aes_decrypt _aes_decrypt -+ -+// THE CIPHER INTERFACE -+ -+#if !defined(AES_BLOCK_SIZE) -+extern void aes_set_blk(aes_context *, const int); -+#endif -+ -+#if defined(CONFIG_X86) || defined(CONFIG_X86_64) -+ asmlinkage -+#endif -+extern void aes_set_key(aes_context *, const unsigned char [], const int, const int); -+ -+#if defined(CONFIG_X86) || defined(CONFIG_X86_64) -+ asmlinkage -+#endif -+extern void aes_encrypt(const aes_context *, const unsigned char [], unsigned char []); -+ -+#if defined(CONFIG_X86) || defined(CONFIG_X86_64) -+ asmlinkage -+#endif -+extern void aes_decrypt(const aes_context *, const unsigned char [], unsigned char []); -+ -+// The block length inputs to aes_set_block and aes_set_key are in numbers -+// of bytes or bits. The calls to subroutines must be made in the above -+// order but multiple calls can be made without repeating earlier calls -+// if their parameters have not changed. 
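A minimal sketch of that call order (illustrative only, not part of the patch): set a key once on a context, then encrypt and decrypt single 16-byte blocks with it. The zeroed key and block are placeholders, and AES_BLOCK_SIZE is assumed to stay at its default of 16.

    /* Minimal sketch of the call order described above (illustrative only).
     * A zeroed key and block stand in for real data; AES_BLOCK_SIZE is 16. */
    #include "aes.h"

    static void aes_usage_sketch(void)
    {
            aes_context ctx;
            unsigned char key[16] = { 0 };   /* 16, 24 or 32 byte key */
            unsigned char pt[16]  = { 0 };   /* one plaintext block */
            unsigned char ct[16];

            /* f == 0 also builds the decryption key schedule;
             * pass non-zero when only encryption is needed */
            aes_set_key(&ctx, key, sizeof(key), 0);
            aes_encrypt(&ctx, pt, ct);
            aes_decrypt(&ctx, ct, pt);       /* recovers the original block */
    }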
-+ -+#endif // _AES_H -diff -urN linux-3.10-noloop/drivers/misc/crypto-ksym.c linux-3.10-AES/drivers/misc/crypto-ksym.c ---- linux-3.10-noloop/drivers/misc/crypto-ksym.c 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/crypto-ksym.c 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,18 @@ -+#include -+#include "aes.h" -+#include "md5.h" -+EXPORT_SYMBOL(aes_set_key); -+EXPORT_SYMBOL(aes_encrypt); -+EXPORT_SYMBOL(aes_decrypt); -+EXPORT_SYMBOL(md5_transform_CPUbyteorder); -+#if defined(CONFIG_X86_64) -+EXPORT_SYMBOL(md5_transform_CPUbyteorder_2x); -+#endif -+#if defined(CONFIG_BLK_DEV_LOOP_INTELAES) && (defined(CONFIG_X86) || defined(CONFIG_X86_64)) -+asmlinkage extern void intel_aes_cbc_encrypt(const aes_context *, void *src, void *dst, size_t len, void *iv); -+asmlinkage extern void intel_aes_cbc_decrypt(const aes_context *, void *src, void *dst, size_t len, void *iv); -+asmlinkage extern void intel_aes_cbc_enc_4x512(aes_context **, void *src, void *dst, void *iv); -+EXPORT_SYMBOL(intel_aes_cbc_encrypt); -+EXPORT_SYMBOL(intel_aes_cbc_decrypt); -+EXPORT_SYMBOL(intel_aes_cbc_enc_4x512); -+#endif -diff -urN linux-3.10-noloop/drivers/misc/md5-2x-amd64.S linux-3.10-AES/drivers/misc/md5-2x-amd64.S ---- linux-3.10-noloop/drivers/misc/md5-2x-amd64.S 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/md5-2x-amd64.S 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,223 @@ -+// -+// md5-2x-amd64.S -+// -+// Written by Jari Ruusu, October 1 2003 -+// -+// Copyright 2003 by Jari Ruusu. -+// Redistribution of this file is permitted under the GNU Public License. -+// -+ -+// Modified by Jari Ruusu, June 12 2004 -+// - Converted 32 bit x86 code to 64 bit AMD64 code -+ -+// Modified by Jari Ruusu, April 11 2010 -+// - Added another parallel MD5 transform computation -+ -+// A MD5 transform implementation for AMD64 compatible processors. -+// This code does not preserve the rax, rcx, rdx, rsi, rdi or r8-r11 -+// registers or the artihmetic status flags. However, the rbx, rbp and -+// r12-r15 registers are preserved across calls. 
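For the AMD64-only two-stream transform defined below, a hedged C-side sketch of a caller, based on the register-usage comment that follows: hash[0..3] and hash[4..7] are two independent MD5 states, and the two 16-word blocks are already in CPU byte order. The RFC 1321 initial values and zeroed inputs are placeholders, not part of the patch.

    /* Illustrative caller for the two-stream transform (not part of the patch). */
    #include "md5.h"

    static void md5_2x_sketch(const u_int32_t inA[16], const u_int32_t inB[16])
    {
            u_int32_t hash2[8] = {
                    0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476,  /* state A */
                    0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476   /* state B */
            };

            /* advances both states by one 64-byte block in a single call */
            md5_transform_CPUbyteorder_2x(hash2, inA, inB);
    }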
-+ -+// void md5_transform_CPUbyteorder_2x(u_int32_t *hashAB, u_int32_t *inA, u_int32_t *inB) -+ -+#if defined(USE_UNDERLINE) -+# define md5_transform_CPUbyteorder_2x _md5_transform_CPUbyteorder_2x -+#endif -+#if !defined(ALIGN64BYTES) -+# define ALIGN64BYTES 64 -+#endif -+ -+ .file "md5-2x-amd64.S" -+ .globl md5_transform_CPUbyteorder_2x -+ -+// rdi = pointer to u_int32_t hash[4 + 4] array which is read and written -+// hash[0...3] are for first MD5, hash[4...7] are for second MD5 -+// rsi = pointer to u_int32_t in[16] array, first MD5, read only -+// rdx = pointer to u_int32_t in[16] array, second MD5, read only -+ -+ .text -+ .align ALIGN64BYTES -+md5_transform_CPUbyteorder_2x: -+ push %rbx -+ push %rbp -+ push %r12 -+ push %r13 -+ push %r14 -+ push %r15 -+ -+ movl 12(%rdi),%eax ; movl 12+16(%rdi),%ebx -+ movl 8(%rdi),%ecx ; movl 8+16(%rdi),%r13d -+ movl (%rdi),%r8d ; movl 16(%rdi),%r11d -+ movl 4(%rdi),%r9d ; movl 4+16(%rdi),%r12d -+ movl (%rsi),%r10d ; movl (%rdx),%ebp -+ prefetcht0 60(%rsi) ; prefetcht0 60(%rdx) -+ movl %eax,%r15d ; movl %ebx,%r14d -+ xorl %ecx,%eax ; xorl %r13d,%ebx -+ -+#define REPEAT1(p1Aw,p1Bw,p2Ax,p2Bx,p3Az,p3Bz,p4c,p5s,p6Nin,p7ANz,p7BNz,p8ANy,p8BNy) \ -+ addl $p4c,p1Aw ; addl $p4c,p1Bw ;\ -+ andl p2Ax,%eax ; andl p2Bx,%ebx ;\ -+ addl %r10d,p1Aw ; addl %ebp,p1Bw ;\ -+ xorl p3Az,%eax ; xorl p3Bz,%ebx ;\ -+ movl p6Nin*4(%rsi),%r10d ; movl p6Nin*4(%rdx),%ebp ;\ -+ addl %eax,p1Aw ; addl %ebx,p1Bw ;\ -+ movl p7ANz,%eax ; movl p7BNz,%ebx ;\ -+ roll $p5s,p1Aw ; roll $p5s,p1Bw ;\ -+ xorl p8ANy,%eax ; xorl p8BNy,%ebx ;\ -+ addl p2Ax,p1Aw ; addl p2Bx,p1Bw -+ -+ REPEAT1(%r8d,%r11d,%r9d,%r12d,%r15d,%r14d,0xd76aa478,7,1,%ecx,%r13d,%r9d,%r12d) -+ REPEAT1(%r15d,%r14d,%r8d,%r11d,%ecx,%r13d,0xe8c7b756,12,2,%r9d,%r12d,%r8d,%r11d) -+ REPEAT1(%ecx,%r13d,%r15d,%r14d,%r9d,%r12d,0x242070db,17,3,%r8d,%r11d,%r15d,%r14d) -+ REPEAT1(%r9d,%r12d,%ecx,%r13d,%r8d,%r11d,0xc1bdceee,22,4,%r15d,%r14d,%ecx,%r13d) -+ REPEAT1(%r8d,%r11d,%r9d,%r12d,%r15d,%r14d,0xf57c0faf,7,5,%ecx,%r13d,%r9d,%r12d) -+ REPEAT1(%r15d,%r14d,%r8d,%r11d,%ecx,%r13d,0x4787c62a,12,6,%r9d,%r12d,%r8d,%r11d) -+ REPEAT1(%ecx,%r13d,%r15d,%r14d,%r9d,%r12d,0xa8304613,17,7,%r8d,%r11d,%r15d,%r14d) -+ REPEAT1(%r9d,%r12d,%ecx,%r13d,%r8d,%r11d,0xfd469501,22,8,%r15d,%r14d,%ecx,%r13d) -+ REPEAT1(%r8d,%r11d,%r9d,%r12d,%r15d,%r14d,0x698098d8,7,9,%ecx,%r13d,%r9d,%r12d) -+ REPEAT1(%r15d,%r14d,%r8d,%r11d,%ecx,%r13d,0x8b44f7af,12,10,%r9d,%r12d,%r8d,%r11d) -+ REPEAT1(%ecx,%r13d,%r15d,%r14d,%r9d,%r12d,0xffff5bb1,17,11,%r8d,%r11d,%r15d,%r14d) -+ REPEAT1(%r9d,%r12d,%ecx,%r13d,%r8d,%r11d,0x895cd7be,22,12,%r15d,%r14d,%ecx,%r13d) -+ REPEAT1(%r8d,%r11d,%r9d,%r12d,%r15d,%r14d,0x6b901122,7,13,%ecx,%r13d,%r9d,%r12d) -+ REPEAT1(%r15d,%r14d,%r8d,%r11d,%ecx,%r13d,0xfd987193,12,14,%r9d,%r12d,%r8d,%r11d) -+ REPEAT1(%ecx,%r13d,%r15d,%r14d,%r9d,%r12d,0xa679438e,17,15,%r8d,%r11d,%r15d,%r14d) -+ -+ addl $0x49b40821,%r9d ; addl $0x49b40821,%r12d -+ andl %ecx,%eax ; andl %r13d,%ebx -+ addl %r10d,%r9d ; addl %ebp,%r12d -+ xorl %r8d,%eax ; xorl %r11d,%ebx -+ movl 1*4(%rsi),%r10d ; movl 1*4(%rdx),%ebp -+ addl %eax,%r9d ; addl %ebx,%r12d -+ movl %ecx,%eax ; movl %r13d,%ebx -+ roll $22,%r9d ; roll $22,%r12d -+ addl %ecx,%r9d ; addl %r13d,%r12d -+ -+#define REPEAT2(p1Aw,p1Bw,p2Ax,p2Bx,p3Ay,p3By,p4Az,p4Bz,p5c,p6s,p7Nin,p8ANy,p8BNy) \ -+ xorl p2Ax,%eax ; xorl p2Bx,%ebx ;\ -+ addl $p5c,p1Aw ; addl $p5c,p1Bw ;\ -+ andl p4Az,%eax ; andl p4Bz,%ebx ;\ -+ addl %r10d,p1Aw ; addl %ebp,p1Bw ;\ -+ xorl p3Ay,%eax ; xorl p3By,%ebx ;\ -+ movl p7Nin*4(%rsi),%r10d ; movl p7Nin*4(%rdx),%ebp 
;\ -+ addl %eax,p1Aw ; addl %ebx,p1Bw ;\ -+ movl p8ANy,%eax ; movl p8BNy,%ebx ;\ -+ roll $p6s,p1Aw ; roll $p6s,p1Bw ;\ -+ addl p2Ax,p1Aw ; addl p2Bx,p1Bw -+ -+ REPEAT2(%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,0xf61e2562,5,6,%r9d,%r12d) -+ REPEAT2(%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,0xc040b340,9,11,%r8d,%r11d) -+ REPEAT2(%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,0x265e5a51,14,0,%r15d,%r14d) -+ REPEAT2(%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,0xe9b6c7aa,20,5,%ecx,%r13d) -+ REPEAT2(%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,0xd62f105d,5,10,%r9d,%r12d) -+ REPEAT2(%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,0x02441453,9,15,%r8d,%r11d) -+ REPEAT2(%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,0xd8a1e681,14,4,%r15d,%r14d) -+ REPEAT2(%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,0xe7d3fbc8,20,9,%ecx,%r13d) -+ REPEAT2(%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,0x21e1cde6,5,14,%r9d,%r12d) -+ REPEAT2(%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,0xc33707d6,9,3,%r8d,%r11d) -+ REPEAT2(%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,0xf4d50d87,14,8,%r15d,%r14d) -+ REPEAT2(%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,0x455a14ed,20,13,%ecx,%r13d) -+ REPEAT2(%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,0xa9e3e905,5,2,%r9d,%r12d) -+ REPEAT2(%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,0xfcefa3f8,9,7,%r8d,%r11d) -+ REPEAT2(%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,0x676f02d9,14,12,%r15d,%r14d) -+ -+ xorl %ecx,%eax ; xorl %r13d,%ebx -+ addl $0x8d2a4c8a,%r9d ; addl $0x8d2a4c8a,%r12d -+ andl %r8d,%eax ; andl %r11d,%ebx -+ addl %r10d,%r9d ; addl %ebp,%r12d -+ xorl %r15d,%eax ; xorl %r14d,%ebx -+ movl 5*4(%rsi),%r10d ; movl 5*4(%rdx),%ebp -+ addl %eax,%r9d ; addl %ebx,%r12d -+ movl %ecx,%eax ; movl %r13d,%ebx -+ roll $20,%r9d ; roll $20,%r12d -+ xorl %r15d,%eax ; xorl %r14d,%ebx -+ addl %ecx,%r9d ; addl %r13d,%r12d -+ -+#define REPEAT3(p1Aw,p1Bw,p2Ax,p2Bx,p3c,p4s,p5Nin,p6ANy,p6BNy,p7ANz,p7BNz) \ -+ addl $p3c,p1Aw ; addl $p3c,p1Bw ;\ -+ xorl p2Ax,%eax ; xorl p2Bx,%ebx ;\ -+ addl %r10d,p1Aw ; addl %ebp,p1Bw ;\ -+ movl p5Nin*4(%rsi),%r10d ; movl p5Nin*4(%rdx),%ebp ;\ -+ addl %eax,p1Aw ; addl %ebx,p1Bw ;\ -+ movl p6ANy,%eax ; movl p6BNy,%ebx ;\ -+ roll $p4s,p1Aw ; roll $p4s,p1Bw ;\ -+ xorl p7ANz,%eax ; xorl p7BNz,%ebx ;\ -+ addl p2Ax,p1Aw ; addl p2Bx,p1Bw -+ -+ REPEAT3(%r8d,%r11d,%r9d,%r12d,0xfffa3942,4,8,%r9d,%r12d,%ecx,%r13d) -+ REPEAT3(%r15d,%r14d,%r8d,%r11d,0x8771f681,11,11,%r8d,%r11d,%r9d,%r12d) -+ REPEAT3(%ecx,%r13d,%r15d,%r14d,0x6d9d6122,16,14,%r15d,%r14d,%r8d,%r11d) -+ REPEAT3(%r9d,%r12d,%ecx,%r13d,0xfde5380c,23,1,%ecx,%r13d,%r15d,%r14d) -+ REPEAT3(%r8d,%r11d,%r9d,%r12d,0xa4beea44,4,4,%r9d,%r12d,%ecx,%r13d) -+ REPEAT3(%r15d,%r14d,%r8d,%r11d,0x4bdecfa9,11,7,%r8d,%r11d,%r9d,%r12d) -+ REPEAT3(%ecx,%r13d,%r15d,%r14d,0xf6bb4b60,16,10,%r15d,%r14d,%r8d,%r11d) -+ REPEAT3(%r9d,%r12d,%ecx,%r13d,0xbebfbc70,23,13,%ecx,%r13d,%r15d,%r14d) -+ REPEAT3(%r8d,%r11d,%r9d,%r12d,0x289b7ec6,4,0,%r9d,%r12d,%ecx,%r13d) -+ REPEAT3(%r15d,%r14d,%r8d,%r11d,0xeaa127fa,11,3,%r8d,%r11d,%r9d,%r12d) -+ REPEAT3(%ecx,%r13d,%r15d,%r14d,0xd4ef3085,16,6,%r15d,%r14d,%r8d,%r11d) -+ REPEAT3(%r9d,%r12d,%ecx,%r13d,0x04881d05,23,9,%ecx,%r13d,%r15d,%r14d) -+ REPEAT3(%r8d,%r11d,%r9d,%r12d,0xd9d4d039,4,12,%r9d,%r12d,%ecx,%r13d) -+ REPEAT3(%r15d,%r14d,%r8d,%r11d,0xe6db99e5,11,15,%r8d,%r11d,%r9d,%r12d) -+ REPEAT3(%ecx,%r13d,%r15d,%r14d,0x1fa27cf8,16,2,%r15d,%r14d,%r8d,%r11d) -+ -+ addl $0xc4ac5665,%r9d ; addl $0xc4ac5665,%r12d -+ xorl %ecx,%eax ; xorl %r13d,%ebx -+ addl %r10d,%r9d ; addl %ebp,%r12d -+ movl (%rsi),%r10d ; movl (%rdx),%ebp 
-+ addl %eax,%r9d ; addl %ebx,%r12d -+ movl %r15d,%eax ; movl %r14d,%ebx -+ roll $23,%r9d ; roll $23,%r12d -+ notl %eax ; notl %ebx -+ addl %ecx,%r9d ; addl %r13d,%r12d -+ -+#define REPEAT4(p1Aw,p1Bw,p2Ax,p2Bx,p3Ay,p3By,p4c,p5s,p6Nin,p7ANz,p7BNz) \ -+ addl $p4c,p1Aw ; addl $p4c,p1Bw ;\ -+ orl p2Ax,%eax ; orl p2Bx,%ebx ;\ -+ addl %r10d,p1Aw ; addl %ebp,p1Bw ;\ -+ xorl p3Ay,%eax ; xorl p3By,%ebx ;\ -+ movl p6Nin*4(%rsi),%r10d ; movl p6Nin*4(%rdx),%ebp ;\ -+ addl %eax,p1Aw ; addl %ebx,p1Bw ;\ -+ movl p7ANz,%eax ; movl p7BNz,%ebx ;\ -+ roll $p5s,p1Aw ; roll $p5s,p1Bw ;\ -+ notl %eax ; notl %ebx ;\ -+ addl p2Ax,p1Aw ; addl p2Bx,p1Bw -+ -+ REPEAT4(%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,0xf4292244,6,7,%ecx,%r13d) -+ REPEAT4(%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,0x432aff97,10,14,%r9d,%r12d) -+ REPEAT4(%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,0xab9423a7,15,5,%r8d,%r11d) -+ REPEAT4(%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,0xfc93a039,21,12,%r15d,%r14d) -+ REPEAT4(%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,0x655b59c3,6,3,%ecx,%r13d) -+ REPEAT4(%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,0x8f0ccc92,10,10,%r9d,%r12d) -+ REPEAT4(%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,0xffeff47d,15,1,%r8d,%r11d) -+ REPEAT4(%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,0x85845dd1,21,8,%r15d,%r14d) -+ REPEAT4(%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,0x6fa87e4f,6,15,%ecx,%r13d) -+ REPEAT4(%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,0xfe2ce6e0,10,6,%r9d,%r12d) -+ REPEAT4(%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,0xa3014314,15,13,%r8d,%r11d) -+ REPEAT4(%r9d,%r12d,%ecx,%r13d,%r15d,%r14d,0x4e0811a1,21,4,%r15d,%r14d) -+ REPEAT4(%r8d,%r11d,%r9d,%r12d,%ecx,%r13d,0xf7537e82,6,11,%ecx,%r13d) -+ REPEAT4(%r15d,%r14d,%r8d,%r11d,%r9d,%r12d,0xbd3af235,10,2,%r9d,%r12d) -+ REPEAT4(%ecx,%r13d,%r15d,%r14d,%r8d,%r11d,0x2ad7d2bb,15,9,%r8d,%r11d) -+ -+ addl $0xeb86d391,%r9d ; addl $0xeb86d391,%r12d -+ orl %ecx,%eax ; orl %r13d,%ebx -+ addl %r10d,%r9d ; addl %ebp,%r12d -+ xorl %r15d,%eax ; xorl %r14d,%ebx -+ addl %eax,%r9d ; addl %ebx,%r12d -+ roll $21,%r9d ; roll $21,%r12d -+ addl %ecx,%r9d ; addl %r13d,%r12d -+ -+ addl %r8d,(%rdi) ; addl %r11d,16(%rdi) -+ addl %r9d,4(%rdi) ; addl %r12d,4+16(%rdi) -+ addl %ecx,8(%rdi) ; addl %r13d,8+16(%rdi) -+ addl %r15d,12(%rdi) ; addl %r14d,12+16(%rdi) -+ -+ pop %r15 -+ pop %r14 -+ pop %r13 -+ pop %r12 -+ pop %rbp -+ pop %rbx -+ ret -+ -+#if defined(__ELF__) && defined(SECTION_NOTE_GNU_STACK) -+ .section .note.GNU-stack,"",@progbits -+#endif -diff -urN linux-3.10-noloop/drivers/misc/md5-amd64.S linux-3.10-AES/drivers/misc/md5-amd64.S ---- linux-3.10-noloop/drivers/misc/md5-amd64.S 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/md5-amd64.S 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,204 @@ -+// -+// md5-amd64.S -+// -+// Written by Jari Ruusu, October 1 2003 -+// -+// Copyright 2003 by Jari Ruusu. -+// Redistribution of this file is permitted under the GNU Public License. -+// -+ -+// Modified by Jari Ruusu, June 12 2004 -+// - Converted 32 bit x86 code to 64 bit AMD64 code -+ -+// A MD5 transform implementation for AMD64 compatible processors. -+// This code does not preserve the rax, rcx, rdx, rsi, rdi or r8-r11 -+// registers or the artihmetic status flags. However, the rbx, rbp and -+// r12-r15 registers are preserved across calls. 
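The single-stream transform below advances one MD5 state by one 64-byte block. A short sketch of one-block use (illustrative, not part of the patch): initialize the chaining words to the RFC 1321 constants, fill in[] with an already padded block in CPU byte order, and run the transform once; each round step follows the standard recurrence w += f(x,y,z) + in; w = (w << s | w >> (32 - s)) + x.

    /* Illustrative one-block use of the transform below (not part of the patch).
     * The caller owns padding: in[] must already hold the 0x80 terminator and
     * the message bit length in words 14 and 15. */
    #include "md5.h"

    static void md5_one_block_sketch(const u_int32_t in[16])
    {
            u_int32_t hash[4] = { 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476 };

            md5_transform_CPUbyteorder(hash, in);
            /* hash[0..3] now hold the digest words for this single block */
    }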
-+ -+// void md5_transform_CPUbyteorder(u_int32_t *hash, u_int32_t *in) -+ -+#if defined(USE_UNDERLINE) -+# define md5_transform_CPUbyteorder _md5_transform_CPUbyteorder -+#endif -+#if !defined(ALIGN64BYTES) -+# define ALIGN64BYTES 64 -+#endif -+ -+ .file "md5-amd64.S" -+ .globl md5_transform_CPUbyteorder -+ -+// rdi = pointer to hash[4] array which is read and written -+// rsi = pointer to in[16] array which is read only -+ -+ .text -+ .align ALIGN64BYTES -+md5_transform_CPUbyteorder: -+ movl 12(%rdi),%eax -+ movl 8(%rdi),%ecx -+ movl (%rdi),%r8d -+ movl 4(%rdi),%r9d -+ movl (%rsi),%r10d -+ prefetcht0 60(%rsi) -+ movl %eax,%edx -+ xorl %ecx,%eax -+ -+#define REPEAT1(p1w,p2x,p3z,p4c,p5s,p6Nin,p7Nz,p8Ny) \ -+ addl $p4c,p1w ;\ -+ andl p2x,%eax ;\ -+ addl %r10d,p1w ;\ -+ xorl p3z,%eax ;\ -+ movl p6Nin*4(%rsi),%r10d ;\ -+ addl %eax,p1w ;\ -+ movl p7Nz,%eax ;\ -+ roll $p5s,p1w ;\ -+ xorl p8Ny,%eax ;\ -+ addl p2x,p1w -+ -+ REPEAT1(%r8d,%r9d,%edx,0xd76aa478, 7, 1,%ecx,%r9d) -+ REPEAT1(%edx,%r8d,%ecx,0xe8c7b756,12, 2,%r9d,%r8d) -+ REPEAT1(%ecx,%edx,%r9d,0x242070db,17, 3,%r8d,%edx) -+ REPEAT1(%r9d,%ecx,%r8d,0xc1bdceee,22, 4,%edx,%ecx) -+ REPEAT1(%r8d,%r9d,%edx,0xf57c0faf, 7, 5,%ecx,%r9d) -+ REPEAT1(%edx,%r8d,%ecx,0x4787c62a,12, 6,%r9d,%r8d) -+ REPEAT1(%ecx,%edx,%r9d,0xa8304613,17, 7,%r8d,%edx) -+ REPEAT1(%r9d,%ecx,%r8d,0xfd469501,22, 8,%edx,%ecx) -+ REPEAT1(%r8d,%r9d,%edx,0x698098d8, 7, 9,%ecx,%r9d) -+ REPEAT1(%edx,%r8d,%ecx,0x8b44f7af,12,10,%r9d,%r8d) -+ REPEAT1(%ecx,%edx,%r9d,0xffff5bb1,17,11,%r8d,%edx) -+ REPEAT1(%r9d,%ecx,%r8d,0x895cd7be,22,12,%edx,%ecx) -+ REPEAT1(%r8d,%r9d,%edx,0x6b901122, 7,13,%ecx,%r9d) -+ REPEAT1(%edx,%r8d,%ecx,0xfd987193,12,14,%r9d,%r8d) -+ REPEAT1(%ecx,%edx,%r9d,0xa679438e,17,15,%r8d,%edx) -+ -+ addl $0x49b40821,%r9d -+ andl %ecx,%eax -+ addl %r10d,%r9d -+ xorl %r8d,%eax -+ movl 1*4(%rsi),%r10d -+ addl %eax,%r9d -+ movl %ecx,%eax -+ roll $22,%r9d -+ addl %ecx,%r9d -+ -+#define REPEAT2(p1w,p2x,p3y,p4z,p5c,p6s,p7Nin,p8Ny) \ -+ xorl p2x,%eax ;\ -+ addl $p5c,p1w ;\ -+ andl p4z,%eax ;\ -+ addl %r10d,p1w ;\ -+ xorl p3y,%eax ;\ -+ movl p7Nin*4(%rsi),%r10d ;\ -+ addl %eax,p1w ;\ -+ movl p8Ny,%eax ;\ -+ roll $p6s,p1w ;\ -+ addl p2x,p1w -+ -+ REPEAT2(%r8d,%r9d,%ecx,%edx,0xf61e2562, 5, 6,%r9d) -+ REPEAT2(%edx,%r8d,%r9d,%ecx,0xc040b340, 9,11,%r8d) -+ REPEAT2(%ecx,%edx,%r8d,%r9d,0x265e5a51,14, 0,%edx) -+ REPEAT2(%r9d,%ecx,%edx,%r8d,0xe9b6c7aa,20, 5,%ecx) -+ REPEAT2(%r8d,%r9d,%ecx,%edx,0xd62f105d, 5,10,%r9d) -+ REPEAT2(%edx,%r8d,%r9d,%ecx,0x02441453, 9,15,%r8d) -+ REPEAT2(%ecx,%edx,%r8d,%r9d,0xd8a1e681,14, 4,%edx) -+ REPEAT2(%r9d,%ecx,%edx,%r8d,0xe7d3fbc8,20, 9,%ecx) -+ REPEAT2(%r8d,%r9d,%ecx,%edx,0x21e1cde6, 5,14,%r9d) -+ REPEAT2(%edx,%r8d,%r9d,%ecx,0xc33707d6, 9, 3,%r8d) -+ REPEAT2(%ecx,%edx,%r8d,%r9d,0xf4d50d87,14, 8,%edx) -+ REPEAT2(%r9d,%ecx,%edx,%r8d,0x455a14ed,20,13,%ecx) -+ REPEAT2(%r8d,%r9d,%ecx,%edx,0xa9e3e905, 5, 2,%r9d) -+ REPEAT2(%edx,%r8d,%r9d,%ecx,0xfcefa3f8, 9, 7,%r8d) -+ REPEAT2(%ecx,%edx,%r8d,%r9d,0x676f02d9,14,12,%edx) -+ -+ xorl %ecx,%eax -+ addl $0x8d2a4c8a,%r9d -+ andl %r8d,%eax -+ addl %r10d,%r9d -+ xorl %edx,%eax -+ movl 5*4(%rsi),%r10d -+ addl %eax,%r9d -+ movl %ecx,%eax -+ roll $20,%r9d -+ xorl %edx,%eax -+ addl %ecx,%r9d -+ -+#define REPEAT3(p1w,p2x,p3c,p4s,p5Nin,p6Ny,p7Nz) \ -+ addl $p3c,p1w ;\ -+ xorl p2x,%eax ;\ -+ addl %r10d,p1w ;\ -+ movl p5Nin*4(%rsi),%r10d ;\ -+ addl %eax,p1w ;\ -+ movl p6Ny,%eax ;\ -+ roll $p4s,p1w ;\ -+ xorl p7Nz,%eax ;\ -+ addl p2x,p1w -+ -+ REPEAT3(%r8d,%r9d,0xfffa3942, 4, 8,%r9d,%ecx) -+ 
REPEAT3(%edx,%r8d,0x8771f681,11,11,%r8d,%r9d) -+ REPEAT3(%ecx,%edx,0x6d9d6122,16,14,%edx,%r8d) -+ REPEAT3(%r9d,%ecx,0xfde5380c,23, 1,%ecx,%edx) -+ REPEAT3(%r8d,%r9d,0xa4beea44, 4, 4,%r9d,%ecx) -+ REPEAT3(%edx,%r8d,0x4bdecfa9,11, 7,%r8d,%r9d) -+ REPEAT3(%ecx,%edx,0xf6bb4b60,16,10,%edx,%r8d) -+ REPEAT3(%r9d,%ecx,0xbebfbc70,23,13,%ecx,%edx) -+ REPEAT3(%r8d,%r9d,0x289b7ec6, 4, 0,%r9d,%ecx) -+ REPEAT3(%edx,%r8d,0xeaa127fa,11, 3,%r8d,%r9d) -+ REPEAT3(%ecx,%edx,0xd4ef3085,16, 6,%edx,%r8d) -+ REPEAT3(%r9d,%ecx,0x04881d05,23, 9,%ecx,%edx) -+ REPEAT3(%r8d,%r9d,0xd9d4d039, 4,12,%r9d,%ecx) -+ REPEAT3(%edx,%r8d,0xe6db99e5,11,15,%r8d,%r9d) -+ REPEAT3(%ecx,%edx,0x1fa27cf8,16, 2,%edx,%r8d) -+ -+ addl $0xc4ac5665,%r9d -+ xorl %ecx,%eax -+ addl %r10d,%r9d -+ movl (%rsi),%r10d -+ addl %eax,%r9d -+ movl %edx,%eax -+ roll $23,%r9d -+ notl %eax -+ addl %ecx,%r9d -+ -+#define REPEAT4(p1w,p2x,p3y,p4c,p5s,p6Nin,p7Nz) \ -+ addl $p4c,p1w ;\ -+ orl p2x,%eax ;\ -+ addl %r10d,p1w ;\ -+ xorl p3y,%eax ;\ -+ movl p6Nin*4(%rsi),%r10d ;\ -+ addl %eax,p1w ;\ -+ movl p7Nz,%eax ;\ -+ roll $p5s,p1w ;\ -+ notl %eax ;\ -+ addl p2x,p1w -+ -+ REPEAT4(%r8d,%r9d,%ecx,0xf4292244, 6, 7,%ecx) -+ REPEAT4(%edx,%r8d,%r9d,0x432aff97,10,14,%r9d) -+ REPEAT4(%ecx,%edx,%r8d,0xab9423a7,15, 5,%r8d) -+ REPEAT4(%r9d,%ecx,%edx,0xfc93a039,21,12,%edx) -+ REPEAT4(%r8d,%r9d,%ecx,0x655b59c3, 6, 3,%ecx) -+ REPEAT4(%edx,%r8d,%r9d,0x8f0ccc92,10,10,%r9d) -+ REPEAT4(%ecx,%edx,%r8d,0xffeff47d,15, 1,%r8d) -+ REPEAT4(%r9d,%ecx,%edx,0x85845dd1,21, 8,%edx) -+ REPEAT4(%r8d,%r9d,%ecx,0x6fa87e4f, 6,15,%ecx) -+ REPEAT4(%edx,%r8d,%r9d,0xfe2ce6e0,10, 6,%r9d) -+ REPEAT4(%ecx,%edx,%r8d,0xa3014314,15,13,%r8d) -+ REPEAT4(%r9d,%ecx,%edx,0x4e0811a1,21, 4,%edx) -+ REPEAT4(%r8d,%r9d,%ecx,0xf7537e82, 6,11,%ecx) -+ REPEAT4(%edx,%r8d,%r9d,0xbd3af235,10, 2,%r9d) -+ REPEAT4(%ecx,%edx,%r8d,0x2ad7d2bb,15, 9,%r8d) -+ -+ addl $0xeb86d391,%r9d -+ orl %ecx,%eax -+ addl %r10d,%r9d -+ xorl %edx,%eax -+ addl %eax,%r9d -+ roll $21,%r9d -+ addl %ecx,%r9d -+ -+ addl %r8d,(%rdi) -+ addl %r9d,4(%rdi) -+ addl %ecx,8(%rdi) -+ addl %edx,12(%rdi) -+ ret -+ -+#if defined(__ELF__) && defined(SECTION_NOTE_GNU_STACK) -+ .section .note.GNU-stack,"",@progbits -+#endif -diff -urN linux-3.10-noloop/drivers/misc/md5-x86.S linux-3.10-AES/drivers/misc/md5-x86.S ---- linux-3.10-noloop/drivers/misc/md5-x86.S 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/md5-x86.S 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,211 @@ -+// -+// md5-x86.S -+// -+// Written by Jari Ruusu, October 1 2003 -+// -+// Copyright 2003 by Jari Ruusu. -+// Redistribution of this file is permitted under the GNU Public License. -+// -+ -+// A MD5 transform implementation for x86 compatible processors. This -+// version uses i386 instruction set but instruction scheduling is optimized -+// for Pentium-2. This code does not preserve the eax, ecx or edx registers -+// or the artihmetic status flags. However, the ebx, esi, edi, and ebp -+// registers are preserved across calls. 
-+ -+// void md5_transform_CPUbyteorder(u_int32_t *hash, u_int32_t *in) -+ -+#if defined(USE_UNDERLINE) -+# define md5_transform_CPUbyteorder _md5_transform_CPUbyteorder -+#endif -+#if !defined(ALIGN32BYTES) -+# define ALIGN32BYTES 32 -+#endif -+ -+ .file "md5-x86.S" -+ .globl md5_transform_CPUbyteorder -+ .text -+ .align ALIGN32BYTES -+ -+md5_transform_CPUbyteorder: -+ push %ebp -+ mov 4+4(%esp),%eax // pointer to 'hash' input -+ mov 8+4(%esp),%ebp // pointer to 'in' array -+ push %ebx -+ push %esi -+ push %edi -+ -+ mov (%eax),%esi -+ mov 4(%eax),%edi -+ mov 8(%eax),%ecx -+ mov 12(%eax),%eax -+ mov (%ebp),%ebx -+ mov %eax,%edx -+ xor %ecx,%eax -+ -+#define REPEAT1(p1w,p2x,p3z,p4c,p5s,p6Nin,p7Nz,p8Ny) \ -+ add $p4c,p1w ;\ -+ and p2x,%eax ;\ -+ add %ebx,p1w ;\ -+ xor p3z,%eax ;\ -+ mov p6Nin*4(%ebp),%ebx ;\ -+ add %eax,p1w ;\ -+ mov p7Nz,%eax ;\ -+ rol $p5s,p1w ;\ -+ xor p8Ny,%eax ;\ -+ add p2x,p1w -+ -+ REPEAT1(%esi,%edi,%edx,0xd76aa478, 7, 1,%ecx,%edi) -+ REPEAT1(%edx,%esi,%ecx,0xe8c7b756,12, 2,%edi,%esi) -+ REPEAT1(%ecx,%edx,%edi,0x242070db,17, 3,%esi,%edx) -+ REPEAT1(%edi,%ecx,%esi,0xc1bdceee,22, 4,%edx,%ecx) -+ REPEAT1(%esi,%edi,%edx,0xf57c0faf, 7, 5,%ecx,%edi) -+ REPEAT1(%edx,%esi,%ecx,0x4787c62a,12, 6,%edi,%esi) -+ REPEAT1(%ecx,%edx,%edi,0xa8304613,17, 7,%esi,%edx) -+ REPEAT1(%edi,%ecx,%esi,0xfd469501,22, 8,%edx,%ecx) -+ REPEAT1(%esi,%edi,%edx,0x698098d8, 7, 9,%ecx,%edi) -+ REPEAT1(%edx,%esi,%ecx,0x8b44f7af,12,10,%edi,%esi) -+ REPEAT1(%ecx,%edx,%edi,0xffff5bb1,17,11,%esi,%edx) -+ REPEAT1(%edi,%ecx,%esi,0x895cd7be,22,12,%edx,%ecx) -+ REPEAT1(%esi,%edi,%edx,0x6b901122, 7,13,%ecx,%edi) -+ REPEAT1(%edx,%esi,%ecx,0xfd987193,12,14,%edi,%esi) -+ REPEAT1(%ecx,%edx,%edi,0xa679438e,17,15,%esi,%edx) -+ -+ add $0x49b40821,%edi -+ and %ecx,%eax -+ add %ebx,%edi -+ xor %esi,%eax -+ mov 1*4(%ebp),%ebx -+ add %eax,%edi -+ mov %ecx,%eax -+ rol $22,%edi -+ add %ecx,%edi -+ -+#define REPEAT2(p1w,p2x,p3y,p4z,p5c,p6s,p7Nin,p8Ny) \ -+ xor p2x,%eax ;\ -+ add $p5c,p1w ;\ -+ and p4z,%eax ;\ -+ add %ebx,p1w ;\ -+ xor p3y,%eax ;\ -+ mov p7Nin*4(%ebp),%ebx ;\ -+ add %eax,p1w ;\ -+ mov p8Ny,%eax ;\ -+ rol $p6s,p1w ;\ -+ add p2x,p1w -+ -+ REPEAT2(%esi,%edi,%ecx,%edx,0xf61e2562, 5, 6,%edi) -+ REPEAT2(%edx,%esi,%edi,%ecx,0xc040b340, 9,11,%esi) -+ REPEAT2(%ecx,%edx,%esi,%edi,0x265e5a51,14, 0,%edx) -+ REPEAT2(%edi,%ecx,%edx,%esi,0xe9b6c7aa,20, 5,%ecx) -+ REPEAT2(%esi,%edi,%ecx,%edx,0xd62f105d, 5,10,%edi) -+ REPEAT2(%edx,%esi,%edi,%ecx,0x02441453, 9,15,%esi) -+ REPEAT2(%ecx,%edx,%esi,%edi,0xd8a1e681,14, 4,%edx) -+ REPEAT2(%edi,%ecx,%edx,%esi,0xe7d3fbc8,20, 9,%ecx) -+ REPEAT2(%esi,%edi,%ecx,%edx,0x21e1cde6, 5,14,%edi) -+ REPEAT2(%edx,%esi,%edi,%ecx,0xc33707d6, 9, 3,%esi) -+ REPEAT2(%ecx,%edx,%esi,%edi,0xf4d50d87,14, 8,%edx) -+ REPEAT2(%edi,%ecx,%edx,%esi,0x455a14ed,20,13,%ecx) -+ REPEAT2(%esi,%edi,%ecx,%edx,0xa9e3e905, 5, 2,%edi) -+ REPEAT2(%edx,%esi,%edi,%ecx,0xfcefa3f8, 9, 7,%esi) -+ REPEAT2(%ecx,%edx,%esi,%edi,0x676f02d9,14,12,%edx) -+ -+ xor %ecx,%eax -+ add $0x8d2a4c8a,%edi -+ and %esi,%eax -+ add %ebx,%edi -+ xor %edx,%eax -+ mov 5*4(%ebp),%ebx -+ add %eax,%edi -+ mov %ecx,%eax -+ rol $20,%edi -+ xor %edx,%eax -+ add %ecx,%edi -+ -+#define REPEAT3(p1w,p2x,p3c,p4s,p5Nin,p6Ny,p7Nz) \ -+ add $p3c,p1w ;\ -+ xor p2x,%eax ;\ -+ add %ebx,p1w ;\ -+ mov p5Nin*4(%ebp),%ebx ;\ -+ add %eax,p1w ;\ -+ mov p6Ny,%eax ;\ -+ rol $p4s,p1w ;\ -+ xor p7Nz,%eax ;\ -+ add p2x,p1w -+ -+ REPEAT3(%esi,%edi,0xfffa3942, 4, 8,%edi,%ecx) -+ REPEAT3(%edx,%esi,0x8771f681,11,11,%esi,%edi) -+ REPEAT3(%ecx,%edx,0x6d9d6122,16,14,%edx,%esi) -+ 
REPEAT3(%edi,%ecx,0xfde5380c,23, 1,%ecx,%edx) -+ REPEAT3(%esi,%edi,0xa4beea44, 4, 4,%edi,%ecx) -+ REPEAT3(%edx,%esi,0x4bdecfa9,11, 7,%esi,%edi) -+ REPEAT3(%ecx,%edx,0xf6bb4b60,16,10,%edx,%esi) -+ REPEAT3(%edi,%ecx,0xbebfbc70,23,13,%ecx,%edx) -+ REPEAT3(%esi,%edi,0x289b7ec6, 4, 0,%edi,%ecx) -+ REPEAT3(%edx,%esi,0xeaa127fa,11, 3,%esi,%edi) -+ REPEAT3(%ecx,%edx,0xd4ef3085,16, 6,%edx,%esi) -+ REPEAT3(%edi,%ecx,0x04881d05,23, 9,%ecx,%edx) -+ REPEAT3(%esi,%edi,0xd9d4d039, 4,12,%edi,%ecx) -+ REPEAT3(%edx,%esi,0xe6db99e5,11,15,%esi,%edi) -+ REPEAT3(%ecx,%edx,0x1fa27cf8,16, 2,%edx,%esi) -+ -+ add $0xc4ac5665,%edi -+ xor %ecx,%eax -+ add %ebx,%edi -+ mov (%ebp),%ebx -+ add %eax,%edi -+ mov %edx,%eax -+ rol $23,%edi -+ not %eax -+ add %ecx,%edi -+ -+#define REPEAT4(p1w,p2x,p3y,p4c,p5s,p6Nin,p7Nz) \ -+ add $p4c,p1w ;\ -+ or p2x,%eax ;\ -+ add %ebx,p1w ;\ -+ xor p3y,%eax ;\ -+ mov p6Nin*4(%ebp),%ebx ;\ -+ add %eax,p1w ;\ -+ mov p7Nz,%eax ;\ -+ rol $p5s,p1w ;\ -+ not %eax ;\ -+ add p2x,p1w -+ -+ REPEAT4(%esi,%edi,%ecx,0xf4292244, 6, 7,%ecx) -+ REPEAT4(%edx,%esi,%edi,0x432aff97,10,14,%edi) -+ REPEAT4(%ecx,%edx,%esi,0xab9423a7,15, 5,%esi) -+ REPEAT4(%edi,%ecx,%edx,0xfc93a039,21,12,%edx) -+ REPEAT4(%esi,%edi,%ecx,0x655b59c3, 6, 3,%ecx) -+ REPEAT4(%edx,%esi,%edi,0x8f0ccc92,10,10,%edi) -+ REPEAT4(%ecx,%edx,%esi,0xffeff47d,15, 1,%esi) -+ REPEAT4(%edi,%ecx,%edx,0x85845dd1,21, 8,%edx) -+ REPEAT4(%esi,%edi,%ecx,0x6fa87e4f, 6,15,%ecx) -+ REPEAT4(%edx,%esi,%edi,0xfe2ce6e0,10, 6,%edi) -+ REPEAT4(%ecx,%edx,%esi,0xa3014314,15,13,%esi) -+ REPEAT4(%edi,%ecx,%edx,0x4e0811a1,21, 4,%edx) -+ REPEAT4(%esi,%edi,%ecx,0xf7537e82, 6,11,%ecx) -+ REPEAT4(%edx,%esi,%edi,0xbd3af235,10, 2,%edi) -+ REPEAT4(%ecx,%edx,%esi,0x2ad7d2bb,15, 9,%esi) -+ -+ add $0xeb86d391,%edi -+ or %ecx,%eax -+ add %ebx,%edi -+ xor %edx,%eax -+ mov 4+16(%esp),%ebp // pointer to 'hash' output -+ add %eax,%edi -+ rol $21,%edi -+ add %ecx,%edi -+ -+ add %esi,(%ebp) -+ add %edi,4(%ebp) -+ add %ecx,8(%ebp) -+ add %edx,12(%ebp) -+ -+ pop %edi -+ pop %esi -+ pop %ebx -+ pop %ebp -+ ret -+ -+#if defined(__ELF__) && defined(SECTION_NOTE_GNU_STACK) -+ .section .note.GNU-stack,"",@progbits -+#endif -diff -urN linux-3.10-noloop/drivers/misc/md5.c linux-3.10-AES/drivers/misc/md5.c ---- linux-3.10-noloop/drivers/misc/md5.c 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/md5.c 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,106 @@ -+/* -+ * MD5 Message Digest Algorithm (RFC1321). -+ * -+ * Derived from cryptoapi implementation, originally based on the -+ * public domain implementation written by Colin Plumb in 1993. -+ * -+ * Copyright (c) Cryptoapi developers. -+ * Copyright (c) 2002 James Morris -+ * -+ * This program is free software; you can redistribute it and/or modify it -+ * under the terms of the GNU General Public License as published by the Free -+ * Software Foundation; either version 2 of the License, or (at your option) -+ * any later version. 
-+ */ -+ -+#include "md5.h" -+ -+#define MD5_F1(x, y, z) (z ^ (x & (y ^ z))) -+#define MD5_F2(x, y, z) MD5_F1(z, x, y) -+#define MD5_F3(x, y, z) (x ^ y ^ z) -+#define MD5_F4(x, y, z) (y ^ (x | ~z)) -+#define MD5_STEP(f, w, x, y, z, in, s) \ -+ (w += f(x, y, z) + in, w = (w<>(32-s)) + x) -+ -+void md5_transform_CPUbyteorder(u_int32_t *hash, u_int32_t const *in) -+{ -+ u_int32_t a, b, c, d; -+ -+ a = hash[0]; -+ b = hash[1]; -+ c = hash[2]; -+ d = hash[3]; -+ -+ MD5_STEP(MD5_F1, a, b, c, d, in[0] + 0xd76aa478, 7); -+ MD5_STEP(MD5_F1, d, a, b, c, in[1] + 0xe8c7b756, 12); -+ MD5_STEP(MD5_F1, c, d, a, b, in[2] + 0x242070db, 17); -+ MD5_STEP(MD5_F1, b, c, d, a, in[3] + 0xc1bdceee, 22); -+ MD5_STEP(MD5_F1, a, b, c, d, in[4] + 0xf57c0faf, 7); -+ MD5_STEP(MD5_F1, d, a, b, c, in[5] + 0x4787c62a, 12); -+ MD5_STEP(MD5_F1, c, d, a, b, in[6] + 0xa8304613, 17); -+ MD5_STEP(MD5_F1, b, c, d, a, in[7] + 0xfd469501, 22); -+ MD5_STEP(MD5_F1, a, b, c, d, in[8] + 0x698098d8, 7); -+ MD5_STEP(MD5_F1, d, a, b, c, in[9] + 0x8b44f7af, 12); -+ MD5_STEP(MD5_F1, c, d, a, b, in[10] + 0xffff5bb1, 17); -+ MD5_STEP(MD5_F1, b, c, d, a, in[11] + 0x895cd7be, 22); -+ MD5_STEP(MD5_F1, a, b, c, d, in[12] + 0x6b901122, 7); -+ MD5_STEP(MD5_F1, d, a, b, c, in[13] + 0xfd987193, 12); -+ MD5_STEP(MD5_F1, c, d, a, b, in[14] + 0xa679438e, 17); -+ MD5_STEP(MD5_F1, b, c, d, a, in[15] + 0x49b40821, 22); -+ -+ MD5_STEP(MD5_F2, a, b, c, d, in[1] + 0xf61e2562, 5); -+ MD5_STEP(MD5_F2, d, a, b, c, in[6] + 0xc040b340, 9); -+ MD5_STEP(MD5_F2, c, d, a, b, in[11] + 0x265e5a51, 14); -+ MD5_STEP(MD5_F2, b, c, d, a, in[0] + 0xe9b6c7aa, 20); -+ MD5_STEP(MD5_F2, a, b, c, d, in[5] + 0xd62f105d, 5); -+ MD5_STEP(MD5_F2, d, a, b, c, in[10] + 0x02441453, 9); -+ MD5_STEP(MD5_F2, c, d, a, b, in[15] + 0xd8a1e681, 14); -+ MD5_STEP(MD5_F2, b, c, d, a, in[4] + 0xe7d3fbc8, 20); -+ MD5_STEP(MD5_F2, a, b, c, d, in[9] + 0x21e1cde6, 5); -+ MD5_STEP(MD5_F2, d, a, b, c, in[14] + 0xc33707d6, 9); -+ MD5_STEP(MD5_F2, c, d, a, b, in[3] + 0xf4d50d87, 14); -+ MD5_STEP(MD5_F2, b, c, d, a, in[8] + 0x455a14ed, 20); -+ MD5_STEP(MD5_F2, a, b, c, d, in[13] + 0xa9e3e905, 5); -+ MD5_STEP(MD5_F2, d, a, b, c, in[2] + 0xfcefa3f8, 9); -+ MD5_STEP(MD5_F2, c, d, a, b, in[7] + 0x676f02d9, 14); -+ MD5_STEP(MD5_F2, b, c, d, a, in[12] + 0x8d2a4c8a, 20); -+ -+ MD5_STEP(MD5_F3, a, b, c, d, in[5] + 0xfffa3942, 4); -+ MD5_STEP(MD5_F3, d, a, b, c, in[8] + 0x8771f681, 11); -+ MD5_STEP(MD5_F3, c, d, a, b, in[11] + 0x6d9d6122, 16); -+ MD5_STEP(MD5_F3, b, c, d, a, in[14] + 0xfde5380c, 23); -+ MD5_STEP(MD5_F3, a, b, c, d, in[1] + 0xa4beea44, 4); -+ MD5_STEP(MD5_F3, d, a, b, c, in[4] + 0x4bdecfa9, 11); -+ MD5_STEP(MD5_F3, c, d, a, b, in[7] + 0xf6bb4b60, 16); -+ MD5_STEP(MD5_F3, b, c, d, a, in[10] + 0xbebfbc70, 23); -+ MD5_STEP(MD5_F3, a, b, c, d, in[13] + 0x289b7ec6, 4); -+ MD5_STEP(MD5_F3, d, a, b, c, in[0] + 0xeaa127fa, 11); -+ MD5_STEP(MD5_F3, c, d, a, b, in[3] + 0xd4ef3085, 16); -+ MD5_STEP(MD5_F3, b, c, d, a, in[6] + 0x04881d05, 23); -+ MD5_STEP(MD5_F3, a, b, c, d, in[9] + 0xd9d4d039, 4); -+ MD5_STEP(MD5_F3, d, a, b, c, in[12] + 0xe6db99e5, 11); -+ MD5_STEP(MD5_F3, c, d, a, b, in[15] + 0x1fa27cf8, 16); -+ MD5_STEP(MD5_F3, b, c, d, a, in[2] + 0xc4ac5665, 23); -+ -+ MD5_STEP(MD5_F4, a, b, c, d, in[0] + 0xf4292244, 6); -+ MD5_STEP(MD5_F4, d, a, b, c, in[7] + 0x432aff97, 10); -+ MD5_STEP(MD5_F4, c, d, a, b, in[14] + 0xab9423a7, 15); -+ MD5_STEP(MD5_F4, b, c, d, a, in[5] + 0xfc93a039, 21); -+ MD5_STEP(MD5_F4, a, b, c, d, in[12] + 0x655b59c3, 6); -+ MD5_STEP(MD5_F4, d, a, b, c, in[3] + 0x8f0ccc92, 10); -+ 
MD5_STEP(MD5_F4, c, d, a, b, in[10] + 0xffeff47d, 15); -+ MD5_STEP(MD5_F4, b, c, d, a, in[1] + 0x85845dd1, 21); -+ MD5_STEP(MD5_F4, a, b, c, d, in[8] + 0x6fa87e4f, 6); -+ MD5_STEP(MD5_F4, d, a, b, c, in[15] + 0xfe2ce6e0, 10); -+ MD5_STEP(MD5_F4, c, d, a, b, in[6] + 0xa3014314, 15); -+ MD5_STEP(MD5_F4, b, c, d, a, in[13] + 0x4e0811a1, 21); -+ MD5_STEP(MD5_F4, a, b, c, d, in[4] + 0xf7537e82, 6); -+ MD5_STEP(MD5_F4, d, a, b, c, in[11] + 0xbd3af235, 10); -+ MD5_STEP(MD5_F4, c, d, a, b, in[2] + 0x2ad7d2bb, 15); -+ MD5_STEP(MD5_F4, b, c, d, a, in[9] + 0xeb86d391, 21); -+ -+ hash[0] += a; -+ hash[1] += b; -+ hash[2] += c; -+ hash[3] += d; -+} -diff -urN linux-3.10-noloop/drivers/misc/md5.h linux-3.10-AES/drivers/misc/md5.h ---- linux-3.10-noloop/drivers/misc/md5.h 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/drivers/misc/md5.h 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,15 @@ -+/* md5.h */ -+ -+#include -+#include -+#include -+ -+#if defined(CONFIG_X86) || defined(CONFIG_X86_64) -+ asmlinkage -+#endif -+extern void md5_transform_CPUbyteorder(u_int32_t *, u_int32_t const *); -+ -+#if defined(CONFIG_X86) || defined(CONFIG_X86_64) -+ asmlinkage -+#endif -+extern void md5_transform_CPUbyteorder_2x(u_int32_t *, u_int32_t const *, u_int32_t const *); -diff -urN linux-3.10-noloop/include/linux/loop.h linux-3.10-AES/include/linux/loop.h ---- linux-3.10-noloop/include/linux/loop.h 1970-01-01 02:00:00.000000000 +0200 -+++ linux-3.10-AES/include/linux/loop.h 2013-07-01 16:12:48.000000000 +0300 -@@ -0,0 +1,171 @@ -+#ifndef _LINUX_LOOP_H -+#define _LINUX_LOOP_H -+ -+/* -+ * include/linux/loop.h -+ * -+ * Written by Theodore Ts'o, 3/29/93. -+ * -+ * Copyright 1993 by Theodore Ts'o. Redistribution of this file is -+ * permitted under the GNU General Public License. 
-+ */ -+ -+#define LO_NAME_SIZE 64 -+#define LO_KEY_SIZE 32 -+ -+#ifdef __KERNEL__ -+#include -+#include -+#include -+#include -+#include -+ -+struct loop_func_table; -+ -+struct loop_device { -+ int lo_number; -+ int lo_refcnt; -+ loff_t lo_offset; -+ loff_t lo_sizelimit; -+ int lo_flags; -+ int (*transfer)(struct loop_device *, int cmd, -+ char *raw_buf, char *loop_buf, int size, -+ sector_t real_block); -+ struct loop_func_table *lo_encryption; -+ char lo_file_name[LO_NAME_SIZE]; -+ char lo_crypt_name[LO_NAME_SIZE]; -+ char lo_encrypt_key[LO_KEY_SIZE]; -+ int lo_encrypt_key_size; -+#if LINUX_VERSION_CODE >= 0x30600 -+ kuid_t lo_key_owner; /* Who set the key */ -+#else -+ uid_t lo_key_owner; /* Who set the key */ -+#endif -+ __u32 lo_init[2]; -+ int (*ioctl)(struct loop_device *, int cmd, -+ unsigned long arg); -+ -+ struct file * lo_backing_file; -+ struct block_device *lo_device; -+ void *key_data; -+ -+ int old_gfp_mask; -+ -+ spinlock_t lo_lock; -+ struct completion lo_done; -+ atomic_t lo_pending; -+ -+ struct request_queue *lo_queue; -+ -+ struct bio *lo_bio_que0; -+ struct bio *lo_bio_free0; -+ struct bio *lo_bio_free1; -+ int lo_bio_flshMax; -+ int lo_bio_flshCnt; -+ wait_queue_head_t lo_bio_wait; -+ wait_queue_head_t lo_buf_wait; -+ sector_t lo_offs_sec; -+ sector_t lo_iv_remove; -+ spinlock_t lo_ioctl_spin; -+ int lo_ioctl_busy; -+ wait_queue_head_t lo_ioctl_wait; -+ struct request_queue *lo_backingQueue; -+#ifdef CONFIG_BLK_DEV_LOOP_KEYSCRUB -+ void (*lo_keyscrub_fn)(void *); -+ void *lo_keyscrub_ptr; -+#endif -+}; -+ -+#endif /* __KERNEL__ */ -+ -+/* -+ * Loop flags -+ */ -+#define LO_FLAGS_DO_BMAP 1 -+#define LO_FLAGS_READ_ONLY 2 -+ -+#include /* for __kernel_old_dev_t */ -+#include /* for __u64 */ -+ -+/* Backwards compatibility version */ -+struct loop_info { -+ int lo_number; /* ioctl r/o */ -+ __kernel_old_dev_t lo_device; /* ioctl r/o */ -+ unsigned long lo_inode; /* ioctl r/o */ -+ __kernel_old_dev_t lo_rdevice; /* ioctl r/o */ -+ int lo_offset; -+ int lo_encrypt_type; -+ int lo_encrypt_key_size; /* ioctl w/o */ -+ int lo_flags; /* ioctl r/o */ -+ char lo_name[LO_NAME_SIZE]; -+ unsigned char lo_encrypt_key[LO_KEY_SIZE]; /* ioctl w/o */ -+ unsigned long lo_init[2]; -+ char reserved[4]; -+}; -+ -+struct loop_info64 { -+ __u64 lo_device; /* ioctl r/o */ -+ __u64 lo_inode; /* ioctl r/o */ -+ __u64 lo_rdevice; /* ioctl r/o */ -+ __u64 lo_offset; -+ __u64 lo_sizelimit;/* bytes, 0 == max available */ -+ __u32 lo_number; /* ioctl r/o */ -+ __u32 lo_encrypt_type; -+ __u32 lo_encrypt_key_size; /* ioctl w/o */ -+ __u32 lo_flags; /* ioctl r/o */ -+ __u8 lo_file_name[LO_NAME_SIZE]; -+ __u8 lo_crypt_name[LO_NAME_SIZE]; -+ __u8 lo_encrypt_key[LO_KEY_SIZE]; /* ioctl w/o */ -+ __u64 lo_init[2]; -+}; -+ -+/* -+ * Loop filter types -+ */ -+ -+#define LO_CRYPT_NONE 0 -+#define LO_CRYPT_XOR 1 -+#define LO_CRYPT_DES 2 -+#define LO_CRYPT_FISH2 3 /* Twofish encryption */ -+#define LO_CRYPT_BLOW 4 -+#define LO_CRYPT_CAST128 5 -+#define LO_CRYPT_IDEA 6 -+#define LO_CRYPT_DUMMY 9 -+#define LO_CRYPT_SKIPJACK 10 -+#define LO_CRYPT_AES 16 -+#define LO_CRYPT_CRYPTOAPI 18 -+#define MAX_LO_CRYPT 20 -+ -+#ifdef __KERNEL__ -+/* Support for loadable transfer modules */ -+struct loop_func_table { -+ int number; /* filter type */ -+ int (*transfer)(struct loop_device *lo, int cmd, char *raw_buf, -+ char *loop_buf, int size, sector_t real_block); -+ int (*init)(struct loop_device *, struct loop_info64 *); -+ /* release is called from loop_unregister_transfer or clr_fd */ -+ int (*release)(struct loop_device 
*); -+ int (*ioctl)(struct loop_device *, int cmd, unsigned long arg); -+ struct module *owner; -+}; -+ -+int loop_register_transfer(struct loop_func_table *funcs); -+int loop_unregister_transfer(int number); -+ -+#endif -+/* -+ * IOCTL commands --- we will commandeer 0x4C ('L') -+ */ -+ -+#define LOOP_SET_FD 0x4C00 -+#define LOOP_CLR_FD 0x4C01 -+#define LOOP_SET_STATUS 0x4C02 -+#define LOOP_GET_STATUS 0x4C03 -+#define LOOP_SET_STATUS64 0x4C04 -+#define LOOP_GET_STATUS64 0x4C05 -+#define LOOP_CHANGE_FD 0x4C06 -+ -+#define LOOP_MULTI_KEY_SETUP 0x4C4D -+#define LOOP_MULTI_KEY_SETUP_V3 0x4C4E -+#define LOOP_RECOMPUTE_DEV_SIZE 0x4C52 -+#endif
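The ioctl numbers above are what a losetup-style userspace tool drives. A hypothetical sketch (not part of the patch, and assuming the loop-AES version of linux/loop.h shown above is the one installed): attach a backing file to /dev/loop0 and select the AES transfer via LOOP_SET_STATUS64. Real tools such as losetup -e aes also handle key hashing, offsets, multi-key setup and error paths.

    /* Hypothetical userspace sketch: attach a backing file to /dev/loop0 with
     * the AES transfer selected via LOOP_SET_STATUS64. Assumes the loop-AES
     * variant of <linux/loop.h> above, which defines LO_CRYPT_AES (16). */
    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/loop.h>

    int attach_aes_loop(const char *backing, const unsigned char *key, int keylen)
    {
            int lfd = open("/dev/loop0", O_RDWR);
            int ffd = open(backing, O_RDWR);
            struct loop_info64 info;

            if (lfd < 0 || ffd < 0 || ioctl(lfd, LOOP_SET_FD, ffd) < 0)
                    return -1;

            memset(&info, 0, sizeof(info));
            info.lo_encrypt_type = LO_CRYPT_AES;          /* 16, from the table above */
            info.lo_encrypt_key_size = keylen;            /* 16, 24 or 32 bytes */
            memcpy(info.lo_encrypt_key, key, keylen);
            strncpy((char *)info.lo_file_name, backing, LO_NAME_SIZE - 1);

            return ioctl(lfd, LOOP_SET_STATUS64, &info);  /* 0 on success */
    }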