From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80]) by finch.gentoo.org (Postfix) with ESMTP id 2031B13877A for ; Tue, 19 Aug 2014 12:18:06 +0000 (UTC) Received: from pigeon.gentoo.org (localhost [127.0.0.1]) by pigeon.gentoo.org (Postfix) with SMTP id 0666EE0AA9; Tue, 19 Aug 2014 12:17:44 +0000 (UTC) Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pigeon.gentoo.org (Postfix) with ESMTPS id E1B0BE09B8 for ; Tue, 19 Aug 2014 12:17:29 +0000 (UTC) Received: from oystercatcher.gentoo.org (oystercatcher.gentoo.org [148.251.78.52]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by smtp.gentoo.org (Postfix) with ESMTPS id A354C34035E for ; Tue, 19 Aug 2014 12:17:28 +0000 (UTC) Received: from localhost.localdomain (localhost [127.0.0.1]) by oystercatcher.gentoo.org (Postfix) with ESMTP id 151793C7E for ; Tue, 19 Aug 2014 11:44:49 +0000 (UTC) From: "Mike Pagano" To: gentoo-commits@lists.gentoo.org Content-Transfer-Encoding: 8bit Content-type: text/plain; charset=UTF-8 Reply-To: gentoo-dev@lists.gentoo.org, "Mike Pagano" Message-ID: <1407522621.3c7969e2225d5436ff5859c14e57e56af1868bb7.mpagano@gentoo> Subject: [gentoo-commits] proj/linux-patches:3.14 commit in: / X-VCS-Repository: proj/linux-patches X-VCS-Files: 0000_README 1015_linux-3.14.16.patch X-VCS-Directories: / X-VCS-Committer: mpagano X-VCS-Committer-Name: Mike Pagano X-VCS-Revision: 3c7969e2225d5436ff5859c14e57e56af1868bb7 X-VCS-Branch: 3.14 Date: Tue, 19 Aug 2014 11:44:49 +0000 (UTC) Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-Id: Gentoo Linux mail X-BeenThere: gentoo-commits@lists.gentoo.org X-Archives-Salt: e8f32a5b-2a41-475d-b05d-8fef8fd3e203 X-Archives-Hash: 301dbdff395ec2a7ec2a1cece4beb726 commit: 3c7969e2225d5436ff5859c14e57e56af1868bb7 Author: Mike Pagano gentoo org> AuthorDate: Fri Aug 8 18:30:21 2014 +0000 Commit: Mike Pagano gentoo org> CommitDate: Fri Aug 8 18:30:21 2014 +0000 URL: http://sources.gentoo.org/gitweb/?p=proj/linux-patches.git;a=commit;h=3c7969e2 Linux patch 3.14.16 --- 0000_README | 4 + 1015_linux-3.14.16.patch | 1740 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 1744 insertions(+) diff --git a/0000_README b/0000_README index 70e968d..75c60df 100644 --- a/0000_README +++ b/0000_README @@ -102,6 +102,10 @@ Patch: 1014_linux-3.14.15.patch From: http://www.kernel.org Desc: Linux 3.14.15 +Patch: 1015_linux-3.14.16.patch +From: http://www.kernel.org +Desc: Linux 3.14.16 + Patch: 1500_XATTR_USER_PREFIX.patch From: https://bugs.gentoo.org/show_bug.cgi?id=470644 Desc: Support for namespace user.pax.* on tmpfs. diff --git a/1015_linux-3.14.16.patch b/1015_linux-3.14.16.patch new file mode 100644 index 0000000..346b103 --- /dev/null +++ b/1015_linux-3.14.16.patch @@ -0,0 +1,1740 @@ +diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt +index c584a51add15..afe68ddbe6a4 100644 +--- a/Documentation/x86/x86_64/mm.txt ++++ b/Documentation/x86/x86_64/mm.txt +@@ -12,6 +12,8 @@ ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space + ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole + ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB) + ... unused hole ... ++ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks ++... unused hole ... 
+ ffffffff80000000 - ffffffffa0000000 (=512 MB) kernel text mapping, from phys 0 + ffffffffa0000000 - ffffffffff5fffff (=1525 MB) module mapping space + ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls +diff --git a/Makefile b/Makefile +index 188523e9e880..8b22e24a2d8e 100644 +--- a/Makefile ++++ b/Makefile +@@ -1,6 +1,6 @@ + VERSION = 3 + PATCHLEVEL = 14 +-SUBLEVEL = 15 ++SUBLEVEL = 16 + EXTRAVERSION = + NAME = Remembering Coco + +diff --git a/arch/arm/boot/dts/dra7-evm.dts b/arch/arm/boot/dts/dra7-evm.dts +index 5babba0a3a75..904dcf5973f3 100644 +--- a/arch/arm/boot/dts/dra7-evm.dts ++++ b/arch/arm/boot/dts/dra7-evm.dts +@@ -182,6 +182,7 @@ + regulator-name = "ldo3"; + regulator-min-microvolt = <1800000>; + regulator-max-microvolt = <1800000>; ++ regulator-always-on; + regulator-boot-on; + }; + +diff --git a/arch/arm/boot/dts/hi3620.dtsi b/arch/arm/boot/dts/hi3620.dtsi +index ab1116d086be..83a5b8685bd9 100644 +--- a/arch/arm/boot/dts/hi3620.dtsi ++++ b/arch/arm/boot/dts/hi3620.dtsi +@@ -73,7 +73,7 @@ + + L2: l2-cache { + compatible = "arm,pl310-cache"; +- reg = <0xfc10000 0x100000>; ++ reg = <0x100000 0x100000>; + interrupts = <0 15 4>; + cache-unified; + cache-level = <2>; +diff --git a/arch/arm/crypto/aesbs-glue.c b/arch/arm/crypto/aesbs-glue.c +index 4522366da759..15468fbbdea3 100644 +--- a/arch/arm/crypto/aesbs-glue.c ++++ b/arch/arm/crypto/aesbs-glue.c +@@ -137,7 +137,7 @@ static int aesbs_cbc_encrypt(struct blkcipher_desc *desc, + dst += AES_BLOCK_SIZE; + } while (--blocks); + } +- err = blkcipher_walk_done(desc, &walk, 0); ++ err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE); + } + return err; + } +@@ -158,7 +158,7 @@ static int aesbs_cbc_decrypt(struct blkcipher_desc *desc, + bsaes_cbc_encrypt(walk.src.virt.addr, walk.dst.virt.addr, + walk.nbytes, &ctx->dec, walk.iv); + kernel_neon_end(); +- err = blkcipher_walk_done(desc, &walk, 0); ++ err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE); + } + while (walk.nbytes) { + u32 blocks = walk.nbytes / AES_BLOCK_SIZE; +@@ -182,7 +182,7 @@ static int aesbs_cbc_decrypt(struct blkcipher_desc *desc, + dst += AES_BLOCK_SIZE; + src += AES_BLOCK_SIZE; + } while (--blocks); +- err = blkcipher_walk_done(desc, &walk, 0); ++ err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE); + } + return err; + } +@@ -268,7 +268,7 @@ static int aesbs_xts_encrypt(struct blkcipher_desc *desc, + bsaes_xts_encrypt(walk.src.virt.addr, walk.dst.virt.addr, + walk.nbytes, &ctx->enc, walk.iv); + kernel_neon_end(); +- err = blkcipher_walk_done(desc, &walk, 0); ++ err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE); + } + return err; + } +@@ -292,7 +292,7 @@ static int aesbs_xts_decrypt(struct blkcipher_desc *desc, + bsaes_xts_decrypt(walk.src.virt.addr, walk.dst.virt.addr, + walk.nbytes, &ctx->dec, walk.iv); + kernel_neon_end(); +- err = blkcipher_walk_done(desc, &walk, 0); ++ err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE); + } + return err; + } +diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c +index 8e0e52eb76b5..d7a0ee898d24 100644 +--- a/arch/arm/mm/idmap.c ++++ b/arch/arm/mm/idmap.c +@@ -25,6 +25,13 @@ static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end, + pr_warning("Failed to allocate identity pmd.\n"); + return; + } ++ /* ++ * Copy the original PMD to ensure that the PMD entries for ++ * the kernel image are preserved. 
++ */ ++ if (!pud_none(*pud)) ++ memcpy(pmd, pmd_offset(pud, 0), ++ PTRS_PER_PMD * sizeof(pmd_t)); + pud_populate(&init_mm, pud, pmd); + pmd += pmd_index(addr); + } else +diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c +index b68c6b22e1c8..f15c22e8bcd5 100644 +--- a/arch/arm/mm/mmu.c ++++ b/arch/arm/mm/mmu.c +@@ -1436,8 +1436,8 @@ void __init early_paging_init(const struct machine_desc *mdesc, + return; + + /* remap kernel code and data */ +- map_start = init_mm.start_code; +- map_end = init_mm.brk; ++ map_start = init_mm.start_code & PMD_MASK; ++ map_end = ALIGN(init_mm.brk, PMD_SIZE); + + /* get a handle on things... */ + pgd0 = pgd_offset_k(0); +@@ -1472,7 +1472,7 @@ void __init early_paging_init(const struct machine_desc *mdesc, + } + + /* remap pmds for kernel mapping */ +- phys = __pa(map_start) & PMD_MASK; ++ phys = __pa(map_start); + do { + *pmdk++ = __pmd(phys | pmdprot); + phys += PMD_SIZE; +diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig +index 7324107acb40..c718d9f25900 100644 +--- a/arch/x86/Kconfig ++++ b/arch/x86/Kconfig +@@ -966,10 +966,27 @@ config VM86 + default y + depends on X86_32 + ---help--- +- This option is required by programs like DOSEMU to run 16-bit legacy +- code on X86 processors. It also may be needed by software like +- XFree86 to initialize some video cards via BIOS. Disabling this +- option saves about 6k. ++ This option is required by programs like DOSEMU to run ++ 16-bit real mode legacy code on x86 processors. It also may ++ be needed by software like XFree86 to initialize some video ++ cards via BIOS. Disabling this option saves about 6K. ++ ++config X86_16BIT ++ bool "Enable support for 16-bit segments" if EXPERT ++ default y ++ ---help--- ++ This option is required by programs like Wine to run 16-bit ++ protected mode legacy code on x86 processors. 
Disabling ++ this option saves about 300 bytes on i386, or around 6K text ++ plus 16K runtime memory on x86-64, ++ ++config X86_ESPFIX32 ++ def_bool y ++ depends on X86_16BIT && X86_32 ++ ++config X86_ESPFIX64 ++ def_bool y ++ depends on X86_16BIT && X86_64 + + config TOSHIBA + tristate "Toshiba Laptop support" +diff --git a/arch/x86/include/asm/espfix.h b/arch/x86/include/asm/espfix.h +new file mode 100644 +index 000000000000..99efebb2f69d +--- /dev/null ++++ b/arch/x86/include/asm/espfix.h +@@ -0,0 +1,16 @@ ++#ifndef _ASM_X86_ESPFIX_H ++#define _ASM_X86_ESPFIX_H ++ ++#ifdef CONFIG_X86_64 ++ ++#include ++ ++DECLARE_PER_CPU_READ_MOSTLY(unsigned long, espfix_stack); ++DECLARE_PER_CPU_READ_MOSTLY(unsigned long, espfix_waddr); ++ ++extern void init_espfix_bsp(void); ++extern void init_espfix_ap(void); ++ ++#endif /* CONFIG_X86_64 */ ++ ++#endif /* _ASM_X86_ESPFIX_H */ +diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h +index bba3cf88e624..0a8b519226b8 100644 +--- a/arch/x86/include/asm/irqflags.h ++++ b/arch/x86/include/asm/irqflags.h +@@ -129,7 +129,7 @@ static inline notrace unsigned long arch_local_irq_save(void) + + #define PARAVIRT_ADJUST_EXCEPTION_FRAME /* */ + +-#define INTERRUPT_RETURN iretq ++#define INTERRUPT_RETURN jmp native_iret + #define USERGS_SYSRET64 \ + swapgs; \ + sysretq; +diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h +index c883bf726398..7166e25ecb57 100644 +--- a/arch/x86/include/asm/pgtable_64_types.h ++++ b/arch/x86/include/asm/pgtable_64_types.h +@@ -61,6 +61,8 @@ typedef struct { pteval_t pte; } pte_t; + #define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE) + #define MODULES_END _AC(0xffffffffff000000, UL) + #define MODULES_LEN (MODULES_END - MODULES_VADDR) ++#define ESPFIX_PGD_ENTRY _AC(-2, UL) ++#define ESPFIX_BASE_ADDR (ESPFIX_PGD_ENTRY << PGDIR_SHIFT) + + #define EARLY_DYNAMIC_PAGE_TABLES 64 + +diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h +index d62c9f809bc5..75b14ca135be 100644 +--- a/arch/x86/include/asm/setup.h ++++ b/arch/x86/include/asm/setup.h +@@ -65,6 +65,8 @@ static inline void x86_ce4100_early_setup(void) { } + + #ifndef _SETUP + ++#include ++ + /* + * This is set up by the setup-routine at boot-time + */ +diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile +index cb648c84b327..56bac868cb91 100644 +--- a/arch/x86/kernel/Makefile ++++ b/arch/x86/kernel/Makefile +@@ -29,6 +29,7 @@ obj-$(CONFIG_X86_64) += sys_x86_64.o x8664_ksyms_64.o + obj-y += syscall_$(BITS).o + obj-$(CONFIG_X86_64) += vsyscall_64.o + obj-$(CONFIG_X86_64) += vsyscall_emu_64.o ++obj-$(CONFIG_X86_ESPFIX64) += espfix_64.o + obj-$(CONFIG_SYSFS) += ksysfs.o + obj-y += bootflag.o e820.o + obj-y += pci-dma.o quirks.o topology.o kdebugfs.o +diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S +index c87810b1b557..c5a9cb94dee6 100644 +--- a/arch/x86/kernel/entry_32.S ++++ b/arch/x86/kernel/entry_32.S +@@ -529,6 +529,7 @@ syscall_exit: + restore_all: + TRACE_IRQS_IRET + restore_all_notrace: ++#ifdef CONFIG_X86_ESPFIX32 + movl PT_EFLAGS(%esp), %eax # mix EFLAGS, SS and CS + # Warning: PT_OLDSS(%esp) contains the wrong/random values if we + # are returning to the kernel. 
+@@ -539,6 +540,7 @@ restore_all_notrace: + cmpl $((SEGMENT_LDT << 8) | USER_RPL), %eax + CFI_REMEMBER_STATE + je ldt_ss # returning to user-space with LDT SS ++#endif + restore_nocheck: + RESTORE_REGS 4 # skip orig_eax/error_code + irq_return: +@@ -551,6 +553,7 @@ ENTRY(iret_exc) + .previous + _ASM_EXTABLE(irq_return,iret_exc) + ++#ifdef CONFIG_X86_ESPFIX32 + CFI_RESTORE_STATE + ldt_ss: + #ifdef CONFIG_PARAVIRT +@@ -594,6 +597,7 @@ ldt_ss: + lss (%esp), %esp /* switch to espfix segment */ + CFI_ADJUST_CFA_OFFSET -8 + jmp restore_nocheck ++#endif + CFI_ENDPROC + ENDPROC(system_call) + +@@ -706,6 +710,7 @@ END(syscall_badsys) + * the high word of the segment base from the GDT and swiches to the + * normal stack and adjusts ESP with the matching offset. + */ ++#ifdef CONFIG_X86_ESPFIX32 + /* fixup the stack */ + mov GDT_ESPFIX_SS + 4, %al /* bits 16..23 */ + mov GDT_ESPFIX_SS + 7, %ah /* bits 24..31 */ +@@ -715,8 +720,10 @@ END(syscall_badsys) + pushl_cfi %eax + lss (%esp), %esp /* switch to the normal stack segment */ + CFI_ADJUST_CFA_OFFSET -8 ++#endif + .endm + .macro UNWIND_ESPFIX_STACK ++#ifdef CONFIG_X86_ESPFIX32 + movl %ss, %eax + /* see if on espfix stack */ + cmpw $__ESPFIX_SS, %ax +@@ -727,6 +734,7 @@ END(syscall_badsys) + /* switch to normal stack */ + FIXUP_ESPFIX_STACK + 27: ++#endif + .endm + + /* +@@ -1357,11 +1365,13 @@ END(debug) + ENTRY(nmi) + RING0_INT_FRAME + ASM_CLAC ++#ifdef CONFIG_X86_ESPFIX32 + pushl_cfi %eax + movl %ss, %eax + cmpw $__ESPFIX_SS, %ax + popl_cfi %eax + je nmi_espfix_stack ++#endif + cmpl $ia32_sysenter_target,(%esp) + je nmi_stack_fixup + pushl_cfi %eax +@@ -1401,6 +1411,7 @@ nmi_debug_stack_check: + FIX_STACK 24, nmi_stack_correct, 1 + jmp nmi_stack_correct + ++#ifdef CONFIG_X86_ESPFIX32 + nmi_espfix_stack: + /* We have a RING0_INT_FRAME here. + * +@@ -1422,6 +1433,7 @@ nmi_espfix_stack: + lss 12+4(%esp), %esp # back to espfix stack + CFI_ADJUST_CFA_OFFSET -24 + jmp irq_return ++#endif + CFI_ENDPROC + END(nmi) + +diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S +index 1e96c3628bf2..03cd2a8f6009 100644 +--- a/arch/x86/kernel/entry_64.S ++++ b/arch/x86/kernel/entry_64.S +@@ -58,6 +58,7 @@ + #include + #include + #include ++#include + #include + + /* Avoid __ASSEMBLER__'ifying just for this. */ +@@ -1041,12 +1042,45 @@ restore_args: + + irq_return: + INTERRUPT_RETURN +- _ASM_EXTABLE(irq_return, bad_iret) + +-#ifdef CONFIG_PARAVIRT + ENTRY(native_iret) ++ /* ++ * Are we returning to a stack segment from the LDT? Note: in ++ * 64-bit mode SS:RSP on the exception stack is always valid. 
++ */ ++#ifdef CONFIG_X86_ESPFIX64 ++ testb $4,(SS-RIP)(%rsp) ++ jnz native_irq_return_ldt ++#endif ++ ++native_irq_return_iret: + iretq +- _ASM_EXTABLE(native_iret, bad_iret) ++ _ASM_EXTABLE(native_irq_return_iret, bad_iret) ++ ++#ifdef CONFIG_X86_ESPFIX64 ++native_irq_return_ldt: ++ pushq_cfi %rax ++ pushq_cfi %rdi ++ SWAPGS ++ movq PER_CPU_VAR(espfix_waddr),%rdi ++ movq %rax,(0*8)(%rdi) /* RAX */ ++ movq (2*8)(%rsp),%rax /* RIP */ ++ movq %rax,(1*8)(%rdi) ++ movq (3*8)(%rsp),%rax /* CS */ ++ movq %rax,(2*8)(%rdi) ++ movq (4*8)(%rsp),%rax /* RFLAGS */ ++ movq %rax,(3*8)(%rdi) ++ movq (6*8)(%rsp),%rax /* SS */ ++ movq %rax,(5*8)(%rdi) ++ movq (5*8)(%rsp),%rax /* RSP */ ++ movq %rax,(4*8)(%rdi) ++ andl $0xffff0000,%eax ++ popq_cfi %rdi ++ orq PER_CPU_VAR(espfix_stack),%rax ++ SWAPGS ++ movq %rax,%rsp ++ popq_cfi %rax ++ jmp native_irq_return_iret + #endif + + .section .fixup,"ax" +@@ -1110,9 +1144,40 @@ ENTRY(retint_kernel) + call preempt_schedule_irq + jmp exit_intr + #endif +- + CFI_ENDPROC + END(common_interrupt) ++ ++ /* ++ * If IRET takes a fault on the espfix stack, then we ++ * end up promoting it to a doublefault. In that case, ++ * modify the stack to make it look like we just entered ++ * the #GP handler from user space, similar to bad_iret. ++ */ ++#ifdef CONFIG_X86_ESPFIX64 ++ ALIGN ++__do_double_fault: ++ XCPT_FRAME 1 RDI+8 ++ movq RSP(%rdi),%rax /* Trap on the espfix stack? */ ++ sarq $PGDIR_SHIFT,%rax ++ cmpl $ESPFIX_PGD_ENTRY,%eax ++ jne do_double_fault /* No, just deliver the fault */ ++ cmpl $__KERNEL_CS,CS(%rdi) ++ jne do_double_fault ++ movq RIP(%rdi),%rax ++ cmpq $native_irq_return_iret,%rax ++ jne do_double_fault /* This shouldn't happen... */ ++ movq PER_CPU_VAR(kernel_stack),%rax ++ subq $(6*8-KERNEL_STACK_OFFSET),%rax /* Reset to original stack */ ++ movq %rax,RSP(%rdi) ++ movq $0,(%rax) /* Missing (lost) #GP error code */ ++ movq $general_protection,RIP(%rdi) ++ retq ++ CFI_ENDPROC ++END(__do_double_fault) ++#else ++# define __do_double_fault do_double_fault ++#endif ++ + /* + * End of kprobes section + */ +@@ -1314,7 +1379,7 @@ zeroentry overflow do_overflow + zeroentry bounds do_bounds + zeroentry invalid_op do_invalid_op + zeroentry device_not_available do_device_not_available +-paranoiderrorentry double_fault do_double_fault ++paranoiderrorentry double_fault __do_double_fault + zeroentry coprocessor_segment_overrun do_coprocessor_segment_overrun + errorentry invalid_TSS do_invalid_TSS + errorentry segment_not_present do_segment_not_present +@@ -1601,7 +1666,7 @@ error_sti: + */ + error_kernelspace: + incl %ebx +- leaq irq_return(%rip),%rcx ++ leaq native_irq_return_iret(%rip),%rcx + cmpq %rcx,RIP+8(%rsp) + je error_swapgs + movl %ecx,%eax /* zero extend */ +diff --git a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c +new file mode 100644 +index 000000000000..94d857fb1033 +--- /dev/null ++++ b/arch/x86/kernel/espfix_64.c +@@ -0,0 +1,208 @@ ++/* ----------------------------------------------------------------------- * ++ * ++ * Copyright 2014 Intel Corporation; author: H. Peter Anvin ++ * ++ * This program is free software; you can redistribute it and/or modify it ++ * under the terms and conditions of the GNU General Public License, ++ * version 2, as published by the Free Software Foundation. ++ * ++ * This program is distributed in the hope it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for ++ * more details. 
++ * ++ * ----------------------------------------------------------------------- */ ++ ++/* ++ * The IRET instruction, when returning to a 16-bit segment, only ++ * restores the bottom 16 bits of the user space stack pointer. This ++ * causes some 16-bit software to break, but it also leaks kernel state ++ * to user space. ++ * ++ * This works around this by creating percpu "ministacks", each of which ++ * is mapped 2^16 times 64K apart. When we detect that the return SS is ++ * on the LDT, we copy the IRET frame to the ministack and use the ++ * relevant alias to return to userspace. The ministacks are mapped ++ * readonly, so if the IRET fault we promote #GP to #DF which is an IST ++ * vector and thus has its own stack; we then do the fixup in the #DF ++ * handler. ++ * ++ * This file sets up the ministacks and the related page tables. The ++ * actual ministack invocation is in entry_64.S. ++ */ ++ ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++#include ++ ++/* ++ * Note: we only need 6*8 = 48 bytes for the espfix stack, but round ++ * it up to a cache line to avoid unnecessary sharing. ++ */ ++#define ESPFIX_STACK_SIZE (8*8UL) ++#define ESPFIX_STACKS_PER_PAGE (PAGE_SIZE/ESPFIX_STACK_SIZE) ++ ++/* There is address space for how many espfix pages? */ ++#define ESPFIX_PAGE_SPACE (1UL << (PGDIR_SHIFT-PAGE_SHIFT-16)) ++ ++#define ESPFIX_MAX_CPUS (ESPFIX_STACKS_PER_PAGE * ESPFIX_PAGE_SPACE) ++#if CONFIG_NR_CPUS > ESPFIX_MAX_CPUS ++# error "Need more than one PGD for the ESPFIX hack" ++#endif ++ ++#define PGALLOC_GFP (GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO) ++ ++/* This contains the *bottom* address of the espfix stack */ ++DEFINE_PER_CPU_READ_MOSTLY(unsigned long, espfix_stack); ++DEFINE_PER_CPU_READ_MOSTLY(unsigned long, espfix_waddr); ++ ++/* Initialization mutex - should this be a spinlock? */ ++static DEFINE_MUTEX(espfix_init_mutex); ++ ++/* Page allocation bitmap - each page serves ESPFIX_STACKS_PER_PAGE CPUs */ ++#define ESPFIX_MAX_PAGES DIV_ROUND_UP(CONFIG_NR_CPUS, ESPFIX_STACKS_PER_PAGE) ++static void *espfix_pages[ESPFIX_MAX_PAGES]; ++ ++static __page_aligned_bss pud_t espfix_pud_page[PTRS_PER_PUD] ++ __aligned(PAGE_SIZE); ++ ++static unsigned int page_random, slot_random; ++ ++/* ++ * This returns the bottom address of the espfix stack for a specific CPU. ++ * The math allows for a non-power-of-two ESPFIX_STACK_SIZE, in which case ++ * we have to account for some amount of padding at the end of each page. ++ */ ++static inline unsigned long espfix_base_addr(unsigned int cpu) ++{ ++ unsigned long page, slot; ++ unsigned long addr; ++ ++ page = (cpu / ESPFIX_STACKS_PER_PAGE) ^ page_random; ++ slot = (cpu + slot_random) % ESPFIX_STACKS_PER_PAGE; ++ addr = (page << PAGE_SHIFT) + (slot * ESPFIX_STACK_SIZE); ++ addr = (addr & 0xffffUL) | ((addr & ~0xffffUL) << 16); ++ addr += ESPFIX_BASE_ADDR; ++ return addr; ++} ++ ++#define PTE_STRIDE (65536/PAGE_SIZE) ++#define ESPFIX_PTE_CLONES (PTRS_PER_PTE/PTE_STRIDE) ++#define ESPFIX_PMD_CLONES PTRS_PER_PMD ++#define ESPFIX_PUD_CLONES (65536/(ESPFIX_PTE_CLONES*ESPFIX_PMD_CLONES)) ++ ++#define PGTABLE_PROT ((_KERNPG_TABLE & ~_PAGE_RW) | _PAGE_NX) ++ ++static void init_espfix_random(void) ++{ ++ unsigned long rand; ++ ++ /* ++ * This is run before the entropy pools are initialized, ++ * but this is hopefully better than nothing. 
++ */ ++ if (!arch_get_random_long(&rand)) { ++ /* The constant is an arbitrary large prime */ ++ rdtscll(rand); ++ rand *= 0xc345c6b72fd16123UL; ++ } ++ ++ slot_random = rand % ESPFIX_STACKS_PER_PAGE; ++ page_random = (rand / ESPFIX_STACKS_PER_PAGE) ++ & (ESPFIX_PAGE_SPACE - 1); ++} ++ ++void __init init_espfix_bsp(void) ++{ ++ pgd_t *pgd_p; ++ pteval_t ptemask; ++ ++ ptemask = __supported_pte_mask; ++ ++ /* Install the espfix pud into the kernel page directory */ ++ pgd_p = &init_level4_pgt[pgd_index(ESPFIX_BASE_ADDR)]; ++ pgd_populate(&init_mm, pgd_p, (pud_t *)espfix_pud_page); ++ ++ /* Randomize the locations */ ++ init_espfix_random(); ++ ++ /* The rest is the same as for any other processor */ ++ init_espfix_ap(); ++} ++ ++void init_espfix_ap(void) ++{ ++ unsigned int cpu, page; ++ unsigned long addr; ++ pud_t pud, *pud_p; ++ pmd_t pmd, *pmd_p; ++ pte_t pte, *pte_p; ++ int n; ++ void *stack_page; ++ pteval_t ptemask; ++ ++ /* We only have to do this once... */ ++ if (likely(this_cpu_read(espfix_stack))) ++ return; /* Already initialized */ ++ ++ cpu = smp_processor_id(); ++ addr = espfix_base_addr(cpu); ++ page = cpu/ESPFIX_STACKS_PER_PAGE; ++ ++ /* Did another CPU already set this up? */ ++ stack_page = ACCESS_ONCE(espfix_pages[page]); ++ if (likely(stack_page)) ++ goto done; ++ ++ mutex_lock(&espfix_init_mutex); ++ ++ /* Did we race on the lock? */ ++ stack_page = ACCESS_ONCE(espfix_pages[page]); ++ if (stack_page) ++ goto unlock_done; ++ ++ ptemask = __supported_pte_mask; ++ ++ pud_p = &espfix_pud_page[pud_index(addr)]; ++ pud = *pud_p; ++ if (!pud_present(pud)) { ++ pmd_p = (pmd_t *)__get_free_page(PGALLOC_GFP); ++ pud = __pud(__pa(pmd_p) | (PGTABLE_PROT & ptemask)); ++ paravirt_alloc_pmd(&init_mm, __pa(pmd_p) >> PAGE_SHIFT); ++ for (n = 0; n < ESPFIX_PUD_CLONES; n++) ++ set_pud(&pud_p[n], pud); ++ } ++ ++ pmd_p = pmd_offset(&pud, addr); ++ pmd = *pmd_p; ++ if (!pmd_present(pmd)) { ++ pte_p = (pte_t *)__get_free_page(PGALLOC_GFP); ++ pmd = __pmd(__pa(pte_p) | (PGTABLE_PROT & ptemask)); ++ paravirt_alloc_pte(&init_mm, __pa(pte_p) >> PAGE_SHIFT); ++ for (n = 0; n < ESPFIX_PMD_CLONES; n++) ++ set_pmd(&pmd_p[n], pmd); ++ } ++ ++ pte_p = pte_offset_kernel(&pmd, addr); ++ stack_page = (void *)__get_free_page(GFP_KERNEL); ++ pte = __pte(__pa(stack_page) | (__PAGE_KERNEL_RO & ptemask)); ++ for (n = 0; n < ESPFIX_PTE_CLONES; n++) ++ set_pte(&pte_p[n*PTE_STRIDE], pte); ++ ++ /* Job is done for this CPU and any CPU which shares this page */ ++ ACCESS_ONCE(espfix_pages[page]) = stack_page; ++ ++unlock_done: ++ mutex_unlock(&espfix_init_mutex); ++done: ++ this_cpu_write(espfix_stack, addr); ++ this_cpu_write(espfix_waddr, (unsigned long)stack_page ++ + (addr & ~PAGE_MASK)); ++} +diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c +index dcbbaa165bde..c37886d759cc 100644 +--- a/arch/x86/kernel/ldt.c ++++ b/arch/x86/kernel/ldt.c +@@ -20,8 +20,6 @@ + #include + #include + +-int sysctl_ldt16 = 0; +- + #ifdef CONFIG_SMP + static void flush_ldt(void *current_mm) + { +@@ -231,16 +229,10 @@ static int write_ldt(void __user *ptr, unsigned long bytecount, int oldmode) + } + } + +- /* +- * On x86-64 we do not support 16-bit segments due to +- * IRET leaking the high bits of the kernel stack address. 
+- */ +-#ifdef CONFIG_X86_64 +- if (!ldt_info.seg_32bit && !sysctl_ldt16) { ++ if (!IS_ENABLED(CONFIG_X86_16BIT) && !ldt_info.seg_32bit) { + error = -EINVAL; + goto out_unlock; + } +-#endif + + fill_ldt(&ldt, &ldt_info); + if (oldmode) +diff --git a/arch/x86/kernel/paravirt_patch_64.c b/arch/x86/kernel/paravirt_patch_64.c +index 3f08f34f93eb..a1da6737ba5b 100644 +--- a/arch/x86/kernel/paravirt_patch_64.c ++++ b/arch/x86/kernel/paravirt_patch_64.c +@@ -6,7 +6,6 @@ DEF_NATIVE(pv_irq_ops, irq_disable, "cli"); + DEF_NATIVE(pv_irq_ops, irq_enable, "sti"); + DEF_NATIVE(pv_irq_ops, restore_fl, "pushq %rdi; popfq"); + DEF_NATIVE(pv_irq_ops, save_fl, "pushfq; popq %rax"); +-DEF_NATIVE(pv_cpu_ops, iret, "iretq"); + DEF_NATIVE(pv_mmu_ops, read_cr2, "movq %cr2, %rax"); + DEF_NATIVE(pv_mmu_ops, read_cr3, "movq %cr3, %rax"); + DEF_NATIVE(pv_mmu_ops, write_cr3, "movq %rdi, %cr3"); +@@ -50,7 +49,6 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf, + PATCH_SITE(pv_irq_ops, save_fl); + PATCH_SITE(pv_irq_ops, irq_enable); + PATCH_SITE(pv_irq_ops, irq_disable); +- PATCH_SITE(pv_cpu_ops, iret); + PATCH_SITE(pv_cpu_ops, irq_enable_sysexit); + PATCH_SITE(pv_cpu_ops, usergs_sysret32); + PATCH_SITE(pv_cpu_ops, usergs_sysret64); +diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c +index a32da804252e..395be6d8bbde 100644 +--- a/arch/x86/kernel/smpboot.c ++++ b/arch/x86/kernel/smpboot.c +@@ -243,6 +243,13 @@ static void notrace start_secondary(void *unused) + check_tsc_sync_target(); + + /* ++ * Enable the espfix hack for this CPU ++ */ ++#ifdef CONFIG_X86_ESPFIX64 ++ init_espfix_ap(); ++#endif ++ ++ /* + * We need to hold vector_lock so there the set of online cpus + * does not change while we are assigning vectors to cpus. Holding + * this lock ensures we don't half assign or remove an irq from a cpu. 
+diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c +index 0002a3a33081..3620928631ce 100644 +--- a/arch/x86/mm/dump_pagetables.c ++++ b/arch/x86/mm/dump_pagetables.c +@@ -30,11 +30,13 @@ struct pg_state { + unsigned long start_address; + unsigned long current_address; + const struct addr_marker *marker; ++ unsigned long lines; + }; + + struct addr_marker { + unsigned long start_address; + const char *name; ++ unsigned long max_lines; + }; + + /* indices for address_markers; keep sync'd w/ address_markers below */ +@@ -45,6 +47,7 @@ enum address_markers_idx { + LOW_KERNEL_NR, + VMALLOC_START_NR, + VMEMMAP_START_NR, ++ ESPFIX_START_NR, + HIGH_KERNEL_NR, + MODULES_VADDR_NR, + MODULES_END_NR, +@@ -67,6 +70,7 @@ static struct addr_marker address_markers[] = { + { PAGE_OFFSET, "Low Kernel Mapping" }, + { VMALLOC_START, "vmalloc() Area" }, + { VMEMMAP_START, "Vmemmap" }, ++ { ESPFIX_BASE_ADDR, "ESPfix Area", 16 }, + { __START_KERNEL_map, "High Kernel Mapping" }, + { MODULES_VADDR, "Modules" }, + { MODULES_END, "End Modules" }, +@@ -163,7 +167,7 @@ static void note_page(struct seq_file *m, struct pg_state *st, + pgprot_t new_prot, int level) + { + pgprotval_t prot, cur; +- static const char units[] = "KMGTPE"; ++ static const char units[] = "BKMGTPE"; + + /* + * If we have a "break" in the series, we need to flush the state that +@@ -178,6 +182,7 @@ static void note_page(struct seq_file *m, struct pg_state *st, + st->current_prot = new_prot; + st->level = level; + st->marker = address_markers; ++ st->lines = 0; + seq_printf(m, "---[ %s ]---\n", st->marker->name); + } else if (prot != cur || level != st->level || + st->current_address >= st->marker[1].start_address) { +@@ -188,17 +193,21 @@ static void note_page(struct seq_file *m, struct pg_state *st, + /* + * Now print the actual finished series + */ +- seq_printf(m, "0x%0*lx-0x%0*lx ", +- width, st->start_address, +- width, st->current_address); +- +- delta = (st->current_address - st->start_address) >> 10; +- while (!(delta & 1023) && unit[1]) { +- delta >>= 10; +- unit++; ++ if (!st->marker->max_lines || ++ st->lines < st->marker->max_lines) { ++ seq_printf(m, "0x%0*lx-0x%0*lx ", ++ width, st->start_address, ++ width, st->current_address); ++ ++ delta = (st->current_address - st->start_address) >> 10; ++ while (!(delta & 1023) && unit[1]) { ++ delta >>= 10; ++ unit++; ++ } ++ seq_printf(m, "%9lu%c ", delta, *unit); ++ printk_prot(m, st->current_prot, st->level); + } +- seq_printf(m, "%9lu%c ", delta, *unit); +- printk_prot(m, st->current_prot, st->level); ++ st->lines++; + + /* + * We print markers for special areas of address space, +diff --git a/arch/x86/vdso/vdso32-setup.c b/arch/x86/vdso/vdso32-setup.c +index f1d633a43f8e..d6bfb876cfb0 100644 +--- a/arch/x86/vdso/vdso32-setup.c ++++ b/arch/x86/vdso/vdso32-setup.c +@@ -41,7 +41,6 @@ enum { + #ifdef CONFIG_X86_64 + #define vdso_enabled sysctl_vsyscall32 + #define arch_setup_additional_pages syscall32_setup_pages +-extern int sysctl_ldt16; + #endif + + /* +@@ -381,13 +380,6 @@ static struct ctl_table abi_table2[] = { + .mode = 0644, + .proc_handler = proc_dointvec + }, +- { +- .procname = "ldt16", +- .data = &sysctl_ldt16, +- .maxlen = sizeof(int), +- .mode = 0644, +- .proc_handler = proc_dointvec +- }, + {} + }; + +diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c +index 0982233b9b84..a6a72ce8630f 100644 +--- a/arch/x86/xen/setup.c ++++ b/arch/x86/xen/setup.c +@@ -574,13 +574,7 @@ void xen_enable_syscall(void) + } + #endif /* CONFIG_X86_64 */ + } +-void 
xen_enable_nmi(void) +-{ +-#ifdef CONFIG_X86_64 +- if (register_callback(CALLBACKTYPE_nmi, (char *)nmi)) +- BUG(); +-#endif +-} ++ + void __init xen_pvmmu_arch_setup(void) + { + HYPERVISOR_vm_assist(VMASST_CMD_enable, VMASST_TYPE_4gb_segments); +@@ -595,7 +589,6 @@ void __init xen_pvmmu_arch_setup(void) + + xen_enable_sysenter(); + xen_enable_syscall(); +- xen_enable_nmi(); + } + + /* This function is not called for HVM domains */ +diff --git a/arch/xtensa/kernel/vectors.S b/arch/xtensa/kernel/vectors.S +index f9e1ec346e35..8453e6e39895 100644 +--- a/arch/xtensa/kernel/vectors.S ++++ b/arch/xtensa/kernel/vectors.S +@@ -376,38 +376,42 @@ _DoubleExceptionVector_WindowOverflow: + beqz a2, 1f # if at start of vector, don't restore + + addi a0, a0, -128 +- bbsi a0, 8, 1f # don't restore except for overflow 8 and 12 +- bbsi a0, 7, 2f ++ bbsi.l a0, 8, 1f # don't restore except for overflow 8 and 12 ++ ++ /* ++ * This fixup handler is for the extremely unlikely case where the ++ * overflow handler's reference thru a0 gets a hardware TLB refill ++ * that bumps out the (distinct, aliasing) TLB entry that mapped its ++ * prior references thru a9/a13, and where our reference now thru ++ * a9/a13 gets a 2nd-level miss exception (not hardware TLB refill). ++ */ ++ movi a2, window_overflow_restore_a0_fixup ++ s32i a2, a3, EXC_TABLE_FIXUP ++ l32i a2, a3, EXC_TABLE_DOUBLE_SAVE ++ xsr a3, excsave1 ++ ++ bbsi.l a0, 7, 2f + + /* + * Restore a0 as saved by _WindowOverflow8(). +- * +- * FIXME: we really need a fixup handler for this L32E, +- * for the extremely unlikely case where the overflow handler's +- * reference thru a0 gets a hardware TLB refill that bumps out +- * the (distinct, aliasing) TLB entry that mapped its prior +- * references thru a9, and where our reference now thru a9 +- * gets a 2nd-level miss exception (not hardware TLB refill). + */ + +- l32e a2, a9, -16 +- wsr a2, depc # replace the saved a0 +- j 1f ++ l32e a0, a9, -16 ++ wsr a0, depc # replace the saved a0 ++ j 3f + + 2: + /* + * Restore a0 as saved by _WindowOverflow12(). +- * +- * FIXME: we really need a fixup handler for this L32E, +- * for the extremely unlikely case where the overflow handler's +- * reference thru a0 gets a hardware TLB refill that bumps out +- * the (distinct, aliasing) TLB entry that mapped its prior +- * references thru a13, and where our reference now thru a13 +- * gets a 2nd-level miss exception (not hardware TLB refill). + */ + +- l32e a2, a13, -16 +- wsr a2, depc # replace the saved a0 ++ l32e a0, a13, -16 ++ wsr a0, depc # replace the saved a0 ++3: ++ xsr a3, excsave1 ++ movi a0, 0 ++ s32i a0, a3, EXC_TABLE_FIXUP ++ s32i a2, a3, EXC_TABLE_DOUBLE_SAVE + 1: + /* + * Restore WindowBase while leaving all address registers restored. +@@ -449,6 +453,7 @@ _DoubleExceptionVector_WindowOverflow: + + s32i a0, a2, PT_DEPC + ++_DoubleExceptionVector_handle_exception: + addx4 a0, a0, a3 + l32i a0, a0, EXC_TABLE_FAST_USER + xsr a3, excsave1 +@@ -464,11 +469,120 @@ _DoubleExceptionVector_WindowOverflow: + rotw -3 + j 1b + +- .end literal_prefix + + ENDPROC(_DoubleExceptionVector) + + /* ++ * Fixup handler for TLB miss in double exception handler for window owerflow. ++ * We get here with windowbase set to the window that was being spilled and ++ * a0 trashed. a0 bit 7 determines if this is a call8 (bit clear) or call12 ++ * (bit set) window. 
++ * ++ * We do the following here: ++ * - go to the original window retaining a0 value; ++ * - set up exception stack to return back to appropriate a0 restore code ++ * (we'll need to rotate window back and there's no place to save this ++ * information, use different return address for that); ++ * - handle the exception; ++ * - go to the window that was being spilled; ++ * - set up window_overflow_restore_a0_fixup as a fixup routine; ++ * - reload a0; ++ * - restore the original window; ++ * - reset the default fixup routine; ++ * - return to user. By the time we get to this fixup handler all information ++ * about the conditions of the original double exception that happened in ++ * the window overflow handler is lost, so we just return to userspace to ++ * retry overflow from start. ++ * ++ * a0: value of depc, original value in depc ++ * a2: trashed, original value in EXC_TABLE_DOUBLE_SAVE ++ * a3: exctable, original value in excsave1 ++ */ ++ ++ENTRY(window_overflow_restore_a0_fixup) ++ ++ rsr a0, ps ++ extui a0, a0, PS_OWB_SHIFT, PS_OWB_WIDTH ++ rsr a2, windowbase ++ sub a0, a2, a0 ++ extui a0, a0, 0, 3 ++ l32i a2, a3, EXC_TABLE_DOUBLE_SAVE ++ xsr a3, excsave1 ++ ++ _beqi a0, 1, .Lhandle_1 ++ _beqi a0, 3, .Lhandle_3 ++ ++ .macro overflow_fixup_handle_exception_pane n ++ ++ rsr a0, depc ++ rotw -\n ++ ++ xsr a3, excsave1 ++ wsr a2, depc ++ l32i a2, a3, EXC_TABLE_KSTK ++ s32i a0, a2, PT_AREG0 ++ ++ movi a0, .Lrestore_\n ++ s32i a0, a2, PT_DEPC ++ rsr a0, exccause ++ j _DoubleExceptionVector_handle_exception ++ ++ .endm ++ ++ overflow_fixup_handle_exception_pane 2 ++.Lhandle_1: ++ overflow_fixup_handle_exception_pane 1 ++.Lhandle_3: ++ overflow_fixup_handle_exception_pane 3 ++ ++ .macro overflow_fixup_restore_a0_pane n ++ ++ rotw \n ++ /* Need to preserve a0 value here to be able to handle exception ++ * that may occur on a0 reload from stack. It may occur because ++ * TLB miss handler may not be atomic and pointer to page table ++ * may be lost before we get here. There are no free registers, ++ * so we need to use EXC_TABLE_DOUBLE_SAVE area. ++ */ ++ xsr a3, excsave1 ++ s32i a2, a3, EXC_TABLE_DOUBLE_SAVE ++ movi a2, window_overflow_restore_a0_fixup ++ s32i a2, a3, EXC_TABLE_FIXUP ++ l32i a2, a3, EXC_TABLE_DOUBLE_SAVE ++ xsr a3, excsave1 ++ bbsi.l a0, 7, 1f ++ l32e a0, a9, -16 ++ j 2f ++1: ++ l32e a0, a13, -16 ++2: ++ rotw -\n ++ ++ .endm ++ ++.Lrestore_2: ++ overflow_fixup_restore_a0_pane 2 ++ ++.Lset_default_fixup: ++ xsr a3, excsave1 ++ s32i a2, a3, EXC_TABLE_DOUBLE_SAVE ++ movi a2, 0 ++ s32i a2, a3, EXC_TABLE_FIXUP ++ l32i a2, a3, EXC_TABLE_DOUBLE_SAVE ++ xsr a3, excsave1 ++ rfe ++ ++.Lrestore_1: ++ overflow_fixup_restore_a0_pane 1 ++ j .Lset_default_fixup ++.Lrestore_3: ++ overflow_fixup_restore_a0_pane 3 ++ j .Lset_default_fixup ++ ++ENDPROC(window_overflow_restore_a0_fixup) ++ ++ .end literal_prefix ++/* + * Debug interrupt vector + * + * There is not much space here, so simply jump to another handler. 
+diff --git a/arch/xtensa/kernel/vmlinux.lds.S b/arch/xtensa/kernel/vmlinux.lds.S +index ee32c0085dff..d16db6df86f8 100644 +--- a/arch/xtensa/kernel/vmlinux.lds.S ++++ b/arch/xtensa/kernel/vmlinux.lds.S +@@ -269,13 +269,13 @@ SECTIONS + .UserExceptionVector.literal) + SECTION_VECTOR (_DoubleExceptionVector_literal, + .DoubleExceptionVector.literal, +- DOUBLEEXC_VECTOR_VADDR - 16, ++ DOUBLEEXC_VECTOR_VADDR - 40, + SIZEOF(.UserExceptionVector.text), + .UserExceptionVector.text) + SECTION_VECTOR (_DoubleExceptionVector_text, + .DoubleExceptionVector.text, + DOUBLEEXC_VECTOR_VADDR, +- 32, ++ 40, + .DoubleExceptionVector.literal) + + . = (LOADADDR( .DoubleExceptionVector.text ) + SIZEOF( .DoubleExceptionVector.text ) + 3) & ~ 3; +diff --git a/crypto/af_alg.c b/crypto/af_alg.c +index 966f893711b3..6a3ad8011585 100644 +--- a/crypto/af_alg.c ++++ b/crypto/af_alg.c +@@ -21,6 +21,7 @@ + #include + #include + #include ++#include + + struct alg_type_list { + const struct af_alg_type *type; +@@ -243,6 +244,7 @@ int af_alg_accept(struct sock *sk, struct socket *newsock) + + sock_init_data(newsock, sk2); + sock_graft(sk2, newsock); ++ security_sk_clone(sk, sk2); + + err = type->accept(ask->private, sk2); + if (err) { +diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c +index 199b52b7c3e1..153f4b92cc05 100644 +--- a/drivers/cpufreq/cpufreq.c ++++ b/drivers/cpufreq/cpufreq.c +@@ -1089,10 +1089,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif, + * the creation of a brand new one. So we need to perform this update + * by invoking update_policy_cpu(). + */ +- if (frozen && cpu != policy->cpu) ++ if (frozen && cpu != policy->cpu) { + update_policy_cpu(policy, cpu); +- else ++ WARN_ON(kobject_move(&policy->kobj, &dev->kobj)); ++ } else { + policy->cpu = cpu; ++ } + + policy->governor = CPUFREQ_DEFAULT_GOVERNOR; + cpumask_copy(policy->cpus, cpumask_of(cpu)); +diff --git a/drivers/iio/accel/bma180.c b/drivers/iio/accel/bma180.c +index bfec313492b3..fe83d04784c8 100644 +--- a/drivers/iio/accel/bma180.c ++++ b/drivers/iio/accel/bma180.c +@@ -68,13 +68,13 @@ + /* Defaults values */ + #define BMA180_DEF_PMODE 0 + #define BMA180_DEF_BW 20 +-#define BMA180_DEF_SCALE 250 ++#define BMA180_DEF_SCALE 2452 + + /* Available values for sysfs */ + #define BMA180_FLP_FREQ_AVAILABLE \ + "10 20 40 75 150 300" + #define BMA180_SCALE_AVAILABLE \ +- "0.000130 0.000190 0.000250 0.000380 0.000500 0.000990 0.001980" ++ "0.001275 0.001863 0.002452 0.003727 0.004903 0.009709 0.019417" + + struct bma180_data { + struct i2c_client *client; +@@ -94,7 +94,7 @@ enum bma180_axis { + }; + + static int bw_table[] = { 10, 20, 40, 75, 150, 300 }; /* Hz */ +-static int scale_table[] = { 130, 190, 250, 380, 500, 990, 1980 }; ++static int scale_table[] = { 1275, 1863, 2452, 3727, 4903, 9709, 19417 }; + + static int bma180_get_acc_reg(struct bma180_data *data, enum bma180_axis axis) + { +@@ -376,6 +376,8 @@ static int bma180_write_raw(struct iio_dev *indio_dev, + mutex_unlock(&data->mutex); + return ret; + case IIO_CHAN_INFO_LOW_PASS_FILTER_3DB_FREQUENCY: ++ if (val2) ++ return -EINVAL; + mutex_lock(&data->mutex); + ret = bma180_set_bw(data, val); + mutex_unlock(&data->mutex); +diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c +index fe25042f056a..0f1d9b2ccdfa 100644 +--- a/drivers/iio/industrialio-buffer.c ++++ b/drivers/iio/industrialio-buffer.c +@@ -953,7 +953,7 @@ static int iio_buffer_update_demux(struct iio_dev *indio_dev, + + /* Now we have the two masks, work 
from least sig and build up sizes */ + for_each_set_bit(out_ind, +- indio_dev->active_scan_mask, ++ buffer->scan_mask, + indio_dev->masklength) { + in_ind = find_next_bit(indio_dev->active_scan_mask, + indio_dev->masklength, +diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c +index 66c5d130c8c2..0e722c103562 100644 +--- a/drivers/md/dm-bufio.c ++++ b/drivers/md/dm-bufio.c +@@ -1541,7 +1541,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign + BUG_ON(block_size < 1 << SECTOR_SHIFT || + (block_size & (block_size - 1))); + +- c = kmalloc(sizeof(*c), GFP_KERNEL); ++ c = kzalloc(sizeof(*c), GFP_KERNEL); + if (!c) { + r = -ENOMEM; + goto bad_client; +diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c +index c0ad90d91252..735e939a846d 100644 +--- a/drivers/md/dm-cache-target.c ++++ b/drivers/md/dm-cache-target.c +@@ -231,7 +231,7 @@ struct cache { + /* + * cache_size entries, dirty if set + */ +- dm_cblock_t nr_dirty; ++ atomic_t nr_dirty; + unsigned long *dirty_bitset; + + /* +@@ -493,7 +493,7 @@ static bool is_dirty(struct cache *cache, dm_cblock_t b) + static void set_dirty(struct cache *cache, dm_oblock_t oblock, dm_cblock_t cblock) + { + if (!test_and_set_bit(from_cblock(cblock), cache->dirty_bitset)) { +- cache->nr_dirty = to_cblock(from_cblock(cache->nr_dirty) + 1); ++ atomic_inc(&cache->nr_dirty); + policy_set_dirty(cache->policy, oblock); + } + } +@@ -502,8 +502,7 @@ static void clear_dirty(struct cache *cache, dm_oblock_t oblock, dm_cblock_t cbl + { + if (test_and_clear_bit(from_cblock(cblock), cache->dirty_bitset)) { + policy_clear_dirty(cache->policy, oblock); +- cache->nr_dirty = to_cblock(from_cblock(cache->nr_dirty) - 1); +- if (!from_cblock(cache->nr_dirty)) ++ if (atomic_dec_return(&cache->nr_dirty) == 0) + dm_table_event(cache->ti->table); + } + } +@@ -2286,7 +2285,7 @@ static int cache_create(struct cache_args *ca, struct cache **result) + atomic_set(&cache->quiescing_ack, 0); + + r = -ENOMEM; +- cache->nr_dirty = 0; ++ atomic_set(&cache->nr_dirty, 0); + cache->dirty_bitset = alloc_bitset(from_cblock(cache->cache_size)); + if (!cache->dirty_bitset) { + *error = "could not allocate dirty bitset"; +@@ -2828,7 +2827,7 @@ static void cache_status(struct dm_target *ti, status_type_t type, + + residency = policy_residency(cache->policy); + +- DMEMIT("%u %llu/%llu %u %llu/%llu %u %u %u %u %u %u %llu ", ++ DMEMIT("%u %llu/%llu %u %llu/%llu %u %u %u %u %u %u %lu ", + (unsigned)(DM_CACHE_METADATA_BLOCK_SIZE >> SECTOR_SHIFT), + (unsigned long long)(nr_blocks_metadata - nr_free_blocks_metadata), + (unsigned long long)nr_blocks_metadata, +@@ -2841,7 +2840,7 @@ static void cache_status(struct dm_target *ti, status_type_t type, + (unsigned) atomic_read(&cache->stats.write_miss), + (unsigned) atomic_read(&cache->stats.demotion), + (unsigned) atomic_read(&cache->stats.promotion), +- (unsigned long long) from_cblock(cache->nr_dirty)); ++ (unsigned long) atomic_read(&cache->nr_dirty)); + + if (writethrough_mode(&cache->features)) + DMEMIT("1 writethrough "); +diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c +index 0526ddff977d..0fe7674ad100 100644 +--- a/drivers/net/wireless/ath/ath9k/xmit.c ++++ b/drivers/net/wireless/ath/ath9k/xmit.c +@@ -890,6 +890,15 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq, + + tx_info = IEEE80211_SKB_CB(skb); + tx_info->flags &= ~IEEE80211_TX_CTL_CLEAR_PS_FILT; ++ ++ /* ++ * No aggregation session is running, but there may be frames ++ * 
from a previous session or a failed attempt in the queue. ++ * Send them out as normal data frames ++ */ ++ if (!tid->active) ++ tx_info->flags &= ~IEEE80211_TX_CTL_AMPDU; ++ + if (!(tx_info->flags & IEEE80211_TX_CTL_AMPDU)) { + bf->bf_state.bf_type = 0; + return bf; +diff --git a/drivers/pnp/pnpacpi/core.c b/drivers/pnp/pnpacpi/core.c +index c31aa07b3ba5..da1c6cb1a41e 100644 +--- a/drivers/pnp/pnpacpi/core.c ++++ b/drivers/pnp/pnpacpi/core.c +@@ -339,8 +339,7 @@ static int __init acpi_pnp_match(struct device *dev, void *_pnp) + struct pnp_dev *pnp = _pnp; + + /* true means it matched */ +- return !acpi->physical_node_count +- && compare_pnp_id(pnp->id, acpi_device_hid(acpi)); ++ return pnp->data == acpi; + } + + static struct acpi_device * __init acpi_pnp_find_companion(struct device *dev) +diff --git a/drivers/rapidio/devices/tsi721_dma.c b/drivers/rapidio/devices/tsi721_dma.c +index 91245f5dbe81..47257b6eea84 100644 +--- a/drivers/rapidio/devices/tsi721_dma.c ++++ b/drivers/rapidio/devices/tsi721_dma.c +@@ -287,6 +287,12 @@ struct tsi721_tx_desc *tsi721_desc_get(struct tsi721_bdma_chan *bdma_chan) + "desc %p not ACKed\n", tx_desc); + } + ++ if (ret == NULL) { ++ dev_dbg(bdma_chan->dchan.device->dev, ++ "%s: unable to obtain tx descriptor\n", __func__); ++ goto err_out; ++ } ++ + i = bdma_chan->wr_count_next % bdma_chan->bd_num; + if (i == bdma_chan->bd_num - 1) { + i = 0; +@@ -297,7 +303,7 @@ struct tsi721_tx_desc *tsi721_desc_get(struct tsi721_bdma_chan *bdma_chan) + tx_desc->txd.phys = bdma_chan->bd_phys + + i * sizeof(struct tsi721_dma_desc); + tx_desc->hw_desc = &((struct tsi721_dma_desc *)bdma_chan->bd_base)[i]; +- ++err_out: + spin_unlock_bh(&bdma_chan->lock); + + return ret; +diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c +index 62ec84b42e31..64e487a8bf59 100644 +--- a/drivers/scsi/scsi_lib.c ++++ b/drivers/scsi/scsi_lib.c +@@ -831,6 +831,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) + scsi_next_command(cmd); + return; + } ++ } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) { ++ /* ++ * Certain non BLOCK_PC requests are commands that don't ++ * actually transfer anything (FLUSH), so cannot use ++ * good_bytes != blk_rq_bytes(req) as the signal for an error. ++ * This sets the error explicitly for the problem case. 
++ */ ++ error = __scsi_error_from_host_byte(cmd, result); + } + + /* no bidi support for !REQ_TYPE_BLOCK_PC yet */ +diff --git a/drivers/staging/vt6655/bssdb.c b/drivers/staging/vt6655/bssdb.c +index d7efd0173a9a..7d7578872a84 100644 +--- a/drivers/staging/vt6655/bssdb.c ++++ b/drivers/staging/vt6655/bssdb.c +@@ -983,7 +983,7 @@ start: + pDevice->byERPFlag &= ~(WLAN_SET_ERP_USE_PROTECTION(1)); + } + +- { ++ if (pDevice->eCommandState == WLAN_ASSOCIATE_WAIT) { + pDevice->byReAssocCount++; + /* 10 sec timeout */ + if ((pDevice->byReAssocCount > 10) && (!pDevice->bLinkPass)) { +diff --git a/drivers/staging/vt6655/device_main.c b/drivers/staging/vt6655/device_main.c +index a952df1bf9d6..6f13f0e597f8 100644 +--- a/drivers/staging/vt6655/device_main.c ++++ b/drivers/staging/vt6655/device_main.c +@@ -2430,6 +2430,7 @@ static irqreturn_t device_intr(int irq, void *dev_instance) { + int handled = 0; + unsigned char byData = 0; + int ii = 0; ++ unsigned long flags; + // unsigned char byRSSI; + + MACvReadISR(pDevice->PortOffset, &pDevice->dwIsr); +@@ -2455,7 +2456,8 @@ static irqreturn_t device_intr(int irq, void *dev_instance) { + + handled = 1; + MACvIntDisable(pDevice->PortOffset); +- spin_lock_irq(&pDevice->lock); ++ ++ spin_lock_irqsave(&pDevice->lock, flags); + + //Make sure current page is 0 + VNSvInPortB(pDevice->PortOffset + MAC_REG_PAGE1SEL, &byOrgPageSel); +@@ -2696,7 +2698,8 @@ static irqreturn_t device_intr(int irq, void *dev_instance) { + MACvSelectPage1(pDevice->PortOffset); + } + +- spin_unlock_irq(&pDevice->lock); ++ spin_unlock_irqrestore(&pDevice->lock, flags); ++ + MACvIntEnable(pDevice->PortOffset, IMR_MASK_VALUE); + + return IRQ_RETVAL(handled); +diff --git a/include/dt-bindings/pinctrl/dra.h b/include/dt-bindings/pinctrl/dra.h +index 002a2855c046..3d33794e4f3e 100644 +--- a/include/dt-bindings/pinctrl/dra.h ++++ b/include/dt-bindings/pinctrl/dra.h +@@ -30,7 +30,8 @@ + #define MUX_MODE14 0xe + #define MUX_MODE15 0xf + +-#define PULL_ENA (1 << 16) ++#define PULL_ENA (0 << 16) ++#define PULL_DIS (1 << 16) + #define PULL_UP (1 << 17) + #define INPUT_EN (1 << 18) + #define SLEWCONTROL (1 << 19) +@@ -38,10 +39,10 @@ + #define WAKEUP_EVENT (1 << 25) + + /* Active pin states */ +-#define PIN_OUTPUT 0 ++#define PIN_OUTPUT (0 | PULL_DIS) + #define PIN_OUTPUT_PULLUP (PIN_OUTPUT | PULL_ENA | PULL_UP) + #define PIN_OUTPUT_PULLDOWN (PIN_OUTPUT | PULL_ENA) +-#define PIN_INPUT INPUT_EN ++#define PIN_INPUT (INPUT_EN | PULL_DIS) + #define PIN_INPUT_SLEW (INPUT_EN | SLEWCONTROL) + #define PIN_INPUT_PULLUP (PULL_ENA | INPUT_EN | PULL_UP) + #define PIN_INPUT_PULLDOWN (PULL_ENA | INPUT_EN) +diff --git a/include/linux/printk.h b/include/linux/printk.h +index fa47e2708c01..cbf094f993f4 100644 +--- a/include/linux/printk.h ++++ b/include/linux/printk.h +@@ -132,9 +132,9 @@ asmlinkage __printf(1, 2) __cold + int printk(const char *fmt, ...); + + /* +- * Special printk facility for scheduler use only, _DO_NOT_USE_ ! ++ * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ ! + */ +-__printf(1, 2) __cold int printk_sched(const char *fmt, ...); ++__printf(1, 2) __cold int printk_deferred(const char *fmt, ...); + + /* + * Please don't use printk_ratelimit(), because it shares ratelimiting state +@@ -169,7 +169,7 @@ int printk(const char *s, ...) + return 0; + } + static inline __printf(1, 2) __cold +-int printk_sched(const char *s, ...) ++int printk_deferred(const char *s, ...) 
+ { + return 0; + } +diff --git a/init/main.c b/init/main.c +index 9c7fd4c9249f..58c132d7de4b 100644 +--- a/init/main.c ++++ b/init/main.c +@@ -617,6 +617,10 @@ asmlinkage void __init start_kernel(void) + if (efi_enabled(EFI_RUNTIME_SERVICES)) + efi_enter_virtual_mode(); + #endif ++#ifdef CONFIG_X86_ESPFIX64 ++ /* Should be run before the first non-init thread is created */ ++ init_espfix_bsp(); ++#endif + thread_info_cache_init(); + cred_init(); + fork_init(totalram_pages); +diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c +index 4dae9cbe9259..8c086e6049b9 100644 +--- a/kernel/printk/printk.c ++++ b/kernel/printk/printk.c +@@ -2468,7 +2468,7 @@ void wake_up_klogd(void) + preempt_enable(); + } + +-int printk_sched(const char *fmt, ...) ++int printk_deferred(const char *fmt, ...) + { + unsigned long flags; + va_list args; +diff --git a/kernel/sched/core.c b/kernel/sched/core.c +index 0aae0fcec026..515e212421c0 100644 +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -1322,7 +1322,7 @@ out: + * leave kernel. + */ + if (p->mm && printk_ratelimit()) { +- printk_sched("process %d (%s) no longer affine to cpu%d\n", ++ printk_deferred("process %d (%s) no longer affine to cpu%d\n", + task_pid_nr(p), p->comm, cpu); + } + } +diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c +index ce852643854b..37dac98c0749 100644 +--- a/kernel/sched/deadline.c ++++ b/kernel/sched/deadline.c +@@ -329,7 +329,7 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se, + + if (!lag_once) { + lag_once = true; +- printk_sched("sched: DL replenish lagged to much\n"); ++ printk_deferred("sched: DL replenish lagged to much\n"); + } + dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline; + dl_se->runtime = pi_se->dl_runtime; +diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c +index 1999021042c7..27b8e836307f 100644 +--- a/kernel/sched/rt.c ++++ b/kernel/sched/rt.c +@@ -837,7 +837,7 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq) + + if (!once) { + once = true; +- printk_sched("sched: RT throttling activated\n"); ++ printk_deferred("sched: RT throttling activated\n"); + } + } else { + /* +diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c +index 086ad6043bcb..60ba1af801c3 100644 +--- a/kernel/time/clockevents.c ++++ b/kernel/time/clockevents.c +@@ -146,7 +146,8 @@ static int clockevents_increase_min_delta(struct clock_event_device *dev) + { + /* Nothing to do if we already reached the limit */ + if (dev->min_delta_ns >= MIN_DELTA_LIMIT) { +- printk(KERN_WARNING "CE: Reprogramming failure. Giving up\n"); ++ printk_deferred(KERN_WARNING ++ "CE: Reprogramming failure. Giving up\n"); + dev->next_event.tv64 = KTIME_MAX; + return -ETIME; + } +@@ -159,9 +160,10 @@ static int clockevents_increase_min_delta(struct clock_event_device *dev) + if (dev->min_delta_ns > MIN_DELTA_LIMIT) + dev->min_delta_ns = MIN_DELTA_LIMIT; + +- printk(KERN_WARNING "CE: %s increased min_delta_ns to %llu nsec\n", +- dev->name ? dev->name : "?", +- (unsigned long long) dev->min_delta_ns); ++ printk_deferred(KERN_WARNING ++ "CE: %s increased min_delta_ns to %llu nsec\n", ++ dev->name ? 
dev->name : "?", ++ (unsigned long long) dev->min_delta_ns); + return 0; + } + +diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c +index 4d23dc4d8139..313a662911b1 100644 +--- a/kernel/time/sched_clock.c ++++ b/kernel/time/sched_clock.c +@@ -204,7 +204,8 @@ void __init sched_clock_postinit(void) + + static int sched_clock_suspend(void) + { +- sched_clock_poll(&sched_clock_timer); ++ update_sched_clock(); ++ hrtimer_cancel(&sched_clock_timer); + cd.suspended = true; + return 0; + } +@@ -212,6 +213,7 @@ static int sched_clock_suspend(void) + static void sched_clock_resume(void) + { + cd.epoch_cyc = read_sched_clock(); ++ hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL); + cd.suspended = false; + } + +diff --git a/lib/btree.c b/lib/btree.c +index f9a484676cb6..4264871ea1a0 100644 +--- a/lib/btree.c ++++ b/lib/btree.c +@@ -198,6 +198,7 @@ EXPORT_SYMBOL_GPL(btree_init); + + void btree_destroy(struct btree_head *head) + { ++ mempool_free(head->node, head->mempool); + mempool_destroy(head->mempool); + head->mempool = NULL; + } +diff --git a/mm/memcontrol.c b/mm/memcontrol.c +index 5b6b0039f725..9b35da28b587 100644 +--- a/mm/memcontrol.c ++++ b/mm/memcontrol.c +@@ -5670,8 +5670,12 @@ static int mem_cgroup_oom_notify_cb(struct mem_cgroup *memcg) + { + struct mem_cgroup_eventfd_list *ev; + ++ spin_lock(&memcg_oom_lock); ++ + list_for_each_entry(ev, &memcg->oom_notify, list) + eventfd_signal(ev->eventfd, 1); ++ ++ spin_unlock(&memcg_oom_lock); + return 0; + } + +diff --git a/mm/page-writeback.c b/mm/page-writeback.c +index d013dba21429..9f45f87a5859 100644 +--- a/mm/page-writeback.c ++++ b/mm/page-writeback.c +@@ -1324,9 +1324,9 @@ static inline void bdi_dirty_limits(struct backing_dev_info *bdi, + *bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh); + + if (bdi_bg_thresh) +- *bdi_bg_thresh = div_u64((u64)*bdi_thresh * +- background_thresh, +- dirty_thresh); ++ *bdi_bg_thresh = dirty_thresh ? div_u64((u64)*bdi_thresh * ++ background_thresh, ++ dirty_thresh) : 0; + + /* + * In order to avoid the stacked BDI deadlock we need +diff --git a/mm/page_alloc.c b/mm/page_alloc.c +index 7e7f94755ab5..62e400d00e3f 100644 +--- a/mm/page_alloc.c ++++ b/mm/page_alloc.c +@@ -2434,7 +2434,7 @@ static inline int + gfp_to_alloc_flags(gfp_t gfp_mask) + { + int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET; +- const gfp_t wait = gfp_mask & __GFP_WAIT; ++ const bool atomic = !(gfp_mask & (__GFP_WAIT | __GFP_NO_KSWAPD)); + + /* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */ + BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH); +@@ -2443,20 +2443,20 @@ gfp_to_alloc_flags(gfp_t gfp_mask) + * The caller may dip into page reserves a bit more if the caller + * cannot run direct reclaim, or if the caller has realtime scheduling + * policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will +- * set both ALLOC_HARDER (!wait) and ALLOC_HIGH (__GFP_HIGH). ++ * set both ALLOC_HARDER (atomic == true) and ALLOC_HIGH (__GFP_HIGH). + */ + alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH); + +- if (!wait) { ++ if (atomic) { + /* +- * Not worth trying to allocate harder for +- * __GFP_NOMEMALLOC even if it can't schedule. ++ * Not worth trying to allocate harder for __GFP_NOMEMALLOC even ++ * if it can't schedule. + */ +- if (!(gfp_mask & __GFP_NOMEMALLOC)) ++ if (!(gfp_mask & __GFP_NOMEMALLOC)) + alloc_flags |= ALLOC_HARDER; + /* +- * Ignore cpuset if GFP_ATOMIC (!wait) rather than fail alloc. +- * See also cpuset_zone_allowed() comment in kernel/cpuset.c. 
++ * Ignore cpuset mems for GFP_ATOMIC rather than fail, see the ++ * comment for __cpuset_node_allowed_softwall(). + */ + alloc_flags &= ~ALLOC_CPUSET; + } else if (unlikely(rt_task(current)) && !in_interrupt()) +diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c +index ec6606325cda..1e05bbde47ba 100644 +--- a/net/l2tp/l2tp_ppp.c ++++ b/net/l2tp/l2tp_ppp.c +@@ -1368,7 +1368,7 @@ static int pppol2tp_setsockopt(struct socket *sock, int level, int optname, + int err; + + if (level != SOL_PPPOL2TP) +- return udp_prot.setsockopt(sk, level, optname, optval, optlen); ++ return -EINVAL; + + if (optlen < sizeof(int)) + return -EINVAL; +@@ -1494,7 +1494,7 @@ static int pppol2tp_getsockopt(struct socket *sock, int level, int optname, + struct pppol2tp_session *ps; + + if (level != SOL_PPPOL2TP) +- return udp_prot.getsockopt(sk, level, optname, optval, optlen); ++ return -EINVAL; + + if (get_user(len, optlen)) + return -EFAULT; +diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c +index c14c16a6d62d..e5a7ac2f3687 100644 +--- a/net/mac80211/tx.c ++++ b/net/mac80211/tx.c +@@ -414,6 +414,9 @@ ieee80211_tx_h_multicast_ps_buf(struct ieee80211_tx_data *tx) + if (ieee80211_has_order(hdr->frame_control)) + return TX_CONTINUE; + ++ if (ieee80211_is_probe_req(hdr->frame_control)) ++ return TX_CONTINUE; ++ + if (tx->local->hw.flags & IEEE80211_HW_QUEUE_CONTROL) + info->hw_queue = tx->sdata->vif.cab_queue; + +@@ -464,6 +467,7 @@ ieee80211_tx_h_unicast_ps_buf(struct ieee80211_tx_data *tx) + { + struct sta_info *sta = tx->sta; + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb); ++ struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data; + struct ieee80211_local *local = tx->local; + + if (unlikely(!sta)) +@@ -474,6 +478,15 @@ ieee80211_tx_h_unicast_ps_buf(struct ieee80211_tx_data *tx) + !(info->flags & IEEE80211_TX_CTL_NO_PS_BUFFER))) { + int ac = skb_get_queue_mapping(tx->skb); + ++ /* only deauth, disassoc and action are bufferable MMPDUs */ ++ if (ieee80211_is_mgmt(hdr->frame_control) && ++ !ieee80211_is_deauth(hdr->frame_control) && ++ !ieee80211_is_disassoc(hdr->frame_control) && ++ !ieee80211_is_action(hdr->frame_control)) { ++ info->flags |= IEEE80211_TX_CTL_NO_PS_BUFFER; ++ return TX_CONTINUE; ++ } ++ + ps_dbg(sta->sdata, "STA %pM aid %d: PS buffer for AC %d\n", + sta->sta.addr, sta->sta.aid, ac); + if (tx->local->total_ps_buffered >= TOTAL_MAX_TX_BUFFER) +@@ -532,22 +545,8 @@ ieee80211_tx_h_unicast_ps_buf(struct ieee80211_tx_data *tx) + static ieee80211_tx_result debug_noinline + ieee80211_tx_h_ps_buf(struct ieee80211_tx_data *tx) + { +- struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb); +- struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data; +- + if (unlikely(tx->flags & IEEE80211_TX_PS_BUFFERED)) + return TX_CONTINUE; +- +- /* only deauth, disassoc and action are bufferable MMPDUs */ +- if (ieee80211_is_mgmt(hdr->frame_control) && +- !ieee80211_is_deauth(hdr->frame_control) && +- !ieee80211_is_disassoc(hdr->frame_control) && +- !ieee80211_is_action(hdr->frame_control)) { +- if (tx->flags & IEEE80211_TX_UNICAST) +- info->flags |= IEEE80211_TX_CTL_NO_PS_BUFFER; +- return TX_CONTINUE; +- } +- + if (tx->flags & IEEE80211_TX_UNICAST) + return ieee80211_tx_h_unicast_ps_buf(tx); + else +diff --git a/net/wireless/trace.h b/net/wireless/trace.h +index fbcc23edee54..b89eb3990f0a 100644 +--- a/net/wireless/trace.h ++++ b/net/wireless/trace.h +@@ -2068,7 +2068,8 @@ TRACE_EVENT(cfg80211_michael_mic_failure, + MAC_ASSIGN(addr, addr); + __entry->key_type = key_type; + 
__entry->key_id = key_id; +- memcpy(__entry->tsc, tsc, 6); ++ if (tsc) ++ memcpy(__entry->tsc, tsc, 6); + ), + TP_printk(NETDEV_PR_FMT ", " MAC_PR_FMT ", key type: %d, key id: %d, tsc: %pm", + NETDEV_PR_ARG, MAC_PR_ARG(addr), __entry->key_type,