From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by finch.gentoo.org (Postfix) with ESMTPS id 6F3D4138806 for ; Mon, 25 Dec 2017 14:35:02 +0000 (UTC) Received: from pigeon.gentoo.org (localhost [127.0.0.1]) by pigeon.gentoo.org (Postfix) with SMTP id 41E49E0CCD; Mon, 25 Dec 2017 14:34:59 +0000 (UTC) Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by pigeon.gentoo.org (Postfix) with ESMTPS id E546BE0CCD for ; Mon, 25 Dec 2017 14:34:58 +0000 (UTC) Received: from oystercatcher.gentoo.org (oystercatcher.gentoo.org [148.251.78.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.gentoo.org (Postfix) with ESMTPS id 15A0C33C770 for ; Mon, 25 Dec 2017 14:34:55 +0000 (UTC) Received: from localhost.localdomain (localhost [IPv6:::1]) by oystercatcher.gentoo.org (Postfix) with ESMTP id A0484AE8F for ; Mon, 25 Dec 2017 14:34:53 +0000 (UTC) From: "Alice Ferrazzi" To: gentoo-commits@lists.gentoo.org Content-Transfer-Encoding: 8bit Content-type: text/plain; charset=UTF-8 Reply-To: gentoo-dev@lists.gentoo.org, "Alice Ferrazzi" Message-ID: <1514212079.5a2a504d74eaf43aa0b3bf953836a44e0bb40b28.alicef@gentoo> Subject: [gentoo-commits] proj/linux-patches:4.14 commit in: / X-VCS-Repository: proj/linux-patches X-VCS-Files: 0000_README 1008_linux-4.14.9.patch X-VCS-Directories: / X-VCS-Committer: alicef X-VCS-Committer-Name: Alice Ferrazzi X-VCS-Revision: 5a2a504d74eaf43aa0b3bf953836a44e0bb40b28 X-VCS-Branch: 4.14 Date: Mon, 25 Dec 2017 14:34:53 +0000 (UTC) Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-Id: Gentoo Linux mail X-BeenThere: gentoo-commits@lists.gentoo.org X-Archives-Salt: 63db567b-28f2-4ef6-8292-01a2b7fdef2c X-Archives-Hash: ad780b153ebe611b5f93cd9b21265651 commit: 5a2a504d74eaf43aa0b3bf953836a44e0bb40b28 Author: Alice Ferrazzi gentoo org> AuthorDate: Mon Dec 25 14:27:59 2017 +0000 Commit: Alice Ferrazzi gentoo org> CommitDate: Mon Dec 25 14:27:59 2017 +0000 URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=5a2a504d linux kernel 4.14.9 0000_README | 4 + 1008_linux-4.14.9.patch | 10292 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 10296 insertions(+) diff --git a/0000_README b/0000_README index 9a6da0f..a19ca77 100644 --- a/0000_README +++ b/0000_README @@ -75,6 +75,10 @@ Patch: 1007_linux-4.14.8.patch From: http://www.kernel.org Desc: Linux 4.14.8 +Patch: 1008_linux-4.14.9.patch +From: http://www.kernel.org +Desc: Linux 4.14.9 + Patch: 1500_XATTR_USER_PREFIX.patch From: https://bugs.gentoo.org/show_bug.cgi?id=470644 Desc: Support for namespace user.pax.* on tmpfs. diff --git a/1008_linux-4.14.9.patch b/1008_linux-4.14.9.patch new file mode 100644 index 0000000..006dbd5 --- /dev/null +++ b/1008_linux-4.14.9.patch @@ -0,0 +1,10292 @@ +diff --git a/Documentation/x86/orc-unwinder.txt b/Documentation/x86/orc-unwinder.txt +index af0c9a4c65a6..cd4b29be29af 100644 +--- a/Documentation/x86/orc-unwinder.txt ++++ b/Documentation/x86/orc-unwinder.txt +@@ -4,7 +4,7 @@ ORC unwinder + Overview + -------- + +-The kernel CONFIG_ORC_UNWINDER option enables the ORC unwinder, which is ++The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is + similar in concept to a DWARF unwinder. 
The difference is that the + format of the ORC data is much simpler than DWARF, which in turn allows + the ORC unwinder to be much simpler and faster. +diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt +index b0798e281aa6..3448e675b462 100644 +--- a/Documentation/x86/x86_64/mm.txt ++++ b/Documentation/x86/x86_64/mm.txt +@@ -34,7 +34,7 @@ ff92000000000000 - ffd1ffffffffffff (=54 bits) vmalloc/ioremap space + ffd2000000000000 - ffd3ffffffffffff (=49 bits) hole + ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB) + ... unused hole ... +-ffd8000000000000 - fff7ffffffffffff (=53 bits) kasan shadow memory (8PB) ++ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB) + ... unused hole ... + ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks + ... unused hole ... +diff --git a/Makefile b/Makefile +index 97b5ae76ac8c..ed2132c6d286 100644 +--- a/Makefile ++++ b/Makefile +@@ -1,7 +1,7 @@ + # SPDX-License-Identifier: GPL-2.0 + VERSION = 4 + PATCHLEVEL = 14 +-SUBLEVEL = 8 ++SUBLEVEL = 9 + EXTRAVERSION = + NAME = Petit Gorille + +@@ -935,8 +935,8 @@ ifdef CONFIG_STACK_VALIDATION + ifeq ($(has_libelf),1) + objtool_target := tools/objtool FORCE + else +- ifdef CONFIG_ORC_UNWINDER +- $(error "Cannot generate ORC metadata for CONFIG_ORC_UNWINDER=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel") ++ ifdef CONFIG_UNWINDER_ORC ++ $(error "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel") + else + $(warning "Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel") + endif +diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig +index 8c2a2619971b..f1d7834990ec 100644 +--- a/arch/arm/configs/exynos_defconfig ++++ b/arch/arm/configs/exynos_defconfig +@@ -244,7 +244,7 @@ CONFIG_USB_STORAGE_ONETOUCH=m + CONFIG_USB_STORAGE_KARMA=m + CONFIG_USB_STORAGE_CYPRESS_ATACB=m + CONFIG_USB_STORAGE_ENE_UB6250=m +-CONFIG_USB_UAS=m ++CONFIG_USB_UAS=y + CONFIG_USB_DWC3=y + CONFIG_USB_DWC2=y + CONFIG_USB_HSIC_USB3503=y +diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h +index e9c9a117bd25..c7cdbb43ae7c 100644 +--- a/arch/arm/include/asm/ptrace.h ++++ b/arch/arm/include/asm/ptrace.h +@@ -126,8 +126,7 @@ extern unsigned long profile_pc(struct pt_regs *regs); + /* + * kprobe-based event tracer support + */ +-#include +-#include ++#include + #define MAX_REG_OFFSET (offsetof(struct pt_regs, ARM_ORIG_r0)) + + extern int regs_query_register_offset(const char *name); +diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h +index caf86be815ba..4052ec39e8db 100644 +--- a/arch/arm64/include/asm/fixmap.h ++++ b/arch/arm64/include/asm/fixmap.h +@@ -51,6 +51,13 @@ enum fixed_addresses { + + FIX_EARLYCON_MEM_BASE, + FIX_TEXT_POKE0, ++ ++#ifdef CONFIG_ACPI_APEI_GHES ++ /* Used for GHES mapping from assorted contexts */ ++ FIX_APEI_GHES_IRQ, ++ FIX_APEI_GHES_NMI, ++#endif /* CONFIG_ACPI_APEI_GHES */ ++ + __end_of_permanent_fixed_addresses, + + /* +diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c +index 57190f384f63..ce848ff84edd 100644 +--- a/arch/powerpc/kernel/watchdog.c ++++ b/arch/powerpc/kernel/watchdog.c +@@ -276,9 +276,12 @@ void arch_touch_nmi_watchdog(void) + { + unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000; + int cpu = smp_processor_id(); ++ u64 tb = get_tb(); + +- if (get_tb() - 
per_cpu(wd_timer_tb, cpu) >= ticks) +- watchdog_timer_interrupt(cpu); ++ if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) { ++ per_cpu(wd_timer_tb, cpu) = tb; ++ wd_smp_clear_cpu_pending(cpu, tb); ++ } + } + EXPORT_SYMBOL(arch_touch_nmi_watchdog); + +diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c +index a66e64b0b251..5d115bd32539 100644 +--- a/arch/powerpc/net/bpf_jit_comp64.c ++++ b/arch/powerpc/net/bpf_jit_comp64.c +@@ -762,7 +762,8 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, + func = (u8 *) __bpf_call_base + imm; + + /* Save skb pointer if we need to re-cache skb data */ +- if (bpf_helper_changes_pkt_data(func)) ++ if ((ctx->seen & SEEN_SKB) && ++ bpf_helper_changes_pkt_data(func)) + PPC_BPF_STL(3, 1, bpf_jit_stack_local(ctx)); + + bpf_jit_emit_func_call(image, ctx, (u64)func); +@@ -771,7 +772,8 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, + PPC_MR(b2p[BPF_REG_0], 3); + + /* refresh skb cache */ +- if (bpf_helper_changes_pkt_data(func)) { ++ if ((ctx->seen & SEEN_SKB) && ++ bpf_helper_changes_pkt_data(func)) { + /* reload skb pointer to r3 */ + PPC_BPF_LL(3, 1, bpf_jit_stack_local(ctx)); + bpf_jit_emit_skb_loads(image, ctx); +diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c +index c008083fbc4f..2c8b325591cc 100644 +--- a/arch/powerpc/xmon/xmon.c ++++ b/arch/powerpc/xmon/xmon.c +@@ -530,14 +530,19 @@ static int xmon_core(struct pt_regs *regs, int fromipi) + + waiting: + secondary = 1; ++ spin_begin(); + while (secondary && !xmon_gate) { + if (in_xmon == 0) { +- if (fromipi) ++ if (fromipi) { ++ spin_end(); + goto leave; ++ } + secondary = test_and_set_bit(0, &in_xmon); + } +- barrier(); ++ spin_cpu_relax(); ++ touch_nmi_watchdog(); + } ++ spin_end(); + + if (!secondary && !xmon_gate) { + /* we are the first cpu to come in */ +@@ -568,21 +573,25 @@ static int xmon_core(struct pt_regs *regs, int fromipi) + mb(); + xmon_gate = 1; + barrier(); ++ touch_nmi_watchdog(); + } + + cmdloop: + while (in_xmon) { + if (secondary) { ++ spin_begin(); + if (cpu == xmon_owner) { + if (!test_and_set_bit(0, &xmon_taken)) { + secondary = 0; ++ spin_end(); + continue; + } + /* missed it */ + while (cpu == xmon_owner) +- barrier(); ++ spin_cpu_relax(); + } +- barrier(); ++ spin_cpu_relax(); ++ touch_nmi_watchdog(); + } else { + cmd = cmds(regs); + if (cmd != 0) { +diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c +index b15cd2f0320f..33e2785f6842 100644 +--- a/arch/s390/net/bpf_jit_comp.c ++++ b/arch/s390/net/bpf_jit_comp.c +@@ -55,8 +55,7 @@ struct bpf_jit { + #define SEEN_LITERAL 8 /* code uses literals */ + #define SEEN_FUNC 16 /* calls C functions */ + #define SEEN_TAIL_CALL 32 /* code uses tail calls */ +-#define SEEN_SKB_CHANGE 64 /* code changes skb data */ +-#define SEEN_REG_AX 128 /* code uses constant blinding */ ++#define SEEN_REG_AX 64 /* code uses constant blinding */ + #define SEEN_STACK (SEEN_FUNC | SEEN_MEM | SEEN_SKB) + + /* +@@ -448,12 +447,12 @@ static void bpf_jit_prologue(struct bpf_jit *jit) + EMIT6_DISP_LH(0xe3000000, 0x0024, REG_W1, REG_0, + REG_15, 152); + } +- if (jit->seen & SEEN_SKB) ++ if (jit->seen & SEEN_SKB) { + emit_load_skb_data_hlen(jit); +- if (jit->seen & SEEN_SKB_CHANGE) + /* stg %b1,ST_OFF_SKBP(%r0,%r15) */ + EMIT6_DISP_LH(0xe3000000, 0x0024, BPF_REG_1, REG_0, REG_15, + STK_OFF_SKBP); ++ } + } + + /* +@@ -983,8 +982,8 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i + EMIT2(0x0d00, REG_14, REG_W1); + /* lgr %b0,%r2: load 
return value into %b0 */ + EMIT4(0xb9040000, BPF_REG_0, REG_2); +- if (bpf_helper_changes_pkt_data((void *)func)) { +- jit->seen |= SEEN_SKB_CHANGE; ++ if ((jit->seen & SEEN_SKB) && ++ bpf_helper_changes_pkt_data((void *)func)) { + /* lg %b1,ST_OFF_SKBP(%r15) */ + EMIT6_DISP_LH(0xe3000000, 0x0004, BPF_REG_1, REG_0, + REG_15, STK_OFF_SKBP); +diff --git a/arch/sparc/include/asm/ptrace.h b/arch/sparc/include/asm/ptrace.h +index 6a339a78f4f4..71dd82b43cc5 100644 +--- a/arch/sparc/include/asm/ptrace.h ++++ b/arch/sparc/include/asm/ptrace.h +@@ -7,6 +7,7 @@ + #if defined(__sparc__) && defined(__arch64__) + #ifndef __ASSEMBLY__ + ++#include + #include + #include + +diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c +index 5765e7e711f7..ff5f9cb3039a 100644 +--- a/arch/sparc/net/bpf_jit_comp_64.c ++++ b/arch/sparc/net/bpf_jit_comp_64.c +@@ -1245,14 +1245,16 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) + u8 *func = ((u8 *)__bpf_call_base) + imm; + + ctx->saw_call = true; ++ if (ctx->saw_ld_abs_ind && bpf_helper_changes_pkt_data(func)) ++ emit_reg_move(bpf2sparc[BPF_REG_1], L7, ctx); + + emit_call((u32 *)func, ctx); + emit_nop(ctx); + + emit_reg_move(O0, bpf2sparc[BPF_REG_0], ctx); + +- if (bpf_helper_changes_pkt_data(func) && ctx->saw_ld_abs_ind) +- load_skb_regs(ctx, bpf2sparc[BPF_REG_6]); ++ if (ctx->saw_ld_abs_ind && bpf_helper_changes_pkt_data(func)) ++ load_skb_regs(ctx, L7); + break; + } + +diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild +index 50a32c33d729..73c57f614c9e 100644 +--- a/arch/um/include/asm/Kbuild ++++ b/arch/um/include/asm/Kbuild +@@ -1,4 +1,5 @@ + generic-y += barrier.h ++generic-y += bpf_perf_event.h + generic-y += bug.h + generic-y += clkdev.h + generic-y += current.h +diff --git a/arch/um/include/shared/init.h b/arch/um/include/shared/init.h +index 390572daa40d..b3f5865a92c9 100644 +--- a/arch/um/include/shared/init.h ++++ b/arch/um/include/shared/init.h +@@ -41,7 +41,7 @@ + typedef int (*initcall_t)(void); + typedef void (*exitcall_t)(void); + +-#include ++#include + + /* These are for everybody (although not all archs will actually + discard it in modules) */ +diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig +index 9bceea6a5852..48646160eb83 100644 +--- a/arch/x86/Kconfig ++++ b/arch/x86/Kconfig +@@ -108,7 +108,7 @@ config X86 + select HAVE_ARCH_AUDITSYSCALL + select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE + select HAVE_ARCH_JUMP_LABEL +- select HAVE_ARCH_KASAN if X86_64 && SPARSEMEM_VMEMMAP ++ select HAVE_ARCH_KASAN if X86_64 + select HAVE_ARCH_KGDB + select HAVE_ARCH_KMEMCHECK + select HAVE_ARCH_MMAP_RND_BITS if MMU +@@ -171,7 +171,7 @@ config X86 + select HAVE_PERF_USER_STACK_DUMP + select HAVE_RCU_TABLE_FREE + select HAVE_REGS_AND_STACK_ACCESS_API +- select HAVE_RELIABLE_STACKTRACE if X86_64 && FRAME_POINTER_UNWINDER && STACK_VALIDATION ++ select HAVE_RELIABLE_STACKTRACE if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION + select HAVE_STACK_VALIDATION if X86_64 + select HAVE_SYSCALL_TRACEPOINTS + select HAVE_UNSTABLE_SCHED_CLOCK +@@ -303,7 +303,6 @@ config ARCH_SUPPORTS_DEBUG_PAGEALLOC + config KASAN_SHADOW_OFFSET + hex + depends on KASAN +- default 0xdff8000000000000 if X86_5LEVEL + default 0xdffffc0000000000 + + config HAVE_INTEL_TXT +diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug +index 90b123056f4b..6293a8768a91 100644 +--- a/arch/x86/Kconfig.debug ++++ b/arch/x86/Kconfig.debug +@@ -359,28 +359,14 @@ config PUNIT_ATOM_DEBUG + + choice + prompt "Choose kernel 
unwinder" +- default FRAME_POINTER_UNWINDER ++ default UNWINDER_ORC if X86_64 ++ default UNWINDER_FRAME_POINTER if X86_32 + ---help--- + This determines which method will be used for unwinding kernel stack + traces for panics, oopses, bugs, warnings, perf, /proc//stack, + livepatch, lockdep, and more. + +-config FRAME_POINTER_UNWINDER +- bool "Frame pointer unwinder" +- select FRAME_POINTER +- ---help--- +- This option enables the frame pointer unwinder for unwinding kernel +- stack traces. +- +- The unwinder itself is fast and it uses less RAM than the ORC +- unwinder, but the kernel text size will grow by ~3% and the kernel's +- overall performance will degrade by roughly 5-10%. +- +- This option is recommended if you want to use the livepatch +- consistency model, as this is currently the only way to get a +- reliable stack trace (CONFIG_HAVE_RELIABLE_STACKTRACE). +- +-config ORC_UNWINDER ++config UNWINDER_ORC + bool "ORC unwinder" + depends on X86_64 + select STACK_VALIDATION +@@ -396,7 +382,22 @@ config ORC_UNWINDER + Enabling this option will increase the kernel's runtime memory usage + by roughly 2-4MB, depending on your kernel config. + +-config GUESS_UNWINDER ++config UNWINDER_FRAME_POINTER ++ bool "Frame pointer unwinder" ++ select FRAME_POINTER ++ ---help--- ++ This option enables the frame pointer unwinder for unwinding kernel ++ stack traces. ++ ++ The unwinder itself is fast and it uses less RAM than the ORC ++ unwinder, but the kernel text size will grow by ~3% and the kernel's ++ overall performance will degrade by roughly 5-10%. ++ ++ This option is recommended if you want to use the livepatch ++ consistency model, as this is currently the only way to get a ++ reliable stack trace (CONFIG_HAVE_RELIABLE_STACKTRACE). ++ ++config UNWINDER_GUESS + bool "Guess unwinder" + depends on EXPERT + ---help--- +@@ -411,7 +412,7 @@ config GUESS_UNWINDER + endchoice + + config FRAME_POINTER +- depends on !ORC_UNWINDER && !GUESS_UNWINDER ++ depends on !UNWINDER_ORC && !UNWINDER_GUESS + bool + + endmenu +diff --git a/arch/x86/configs/tiny.config b/arch/x86/configs/tiny.config +index 550cd5012b73..66c9e2aab16c 100644 +--- a/arch/x86/configs/tiny.config ++++ b/arch/x86/configs/tiny.config +@@ -1,5 +1,5 @@ + CONFIG_NOHIGHMEM=y + # CONFIG_HIGHMEM4G is not set + # CONFIG_HIGHMEM64G is not set +-CONFIG_GUESS_UNWINDER=y +-# CONFIG_FRAME_POINTER_UNWINDER is not set ++CONFIG_UNWINDER_GUESS=y ++# CONFIG_UNWINDER_FRAME_POINTER is not set +diff --git a/arch/x86/configs/x86_64_defconfig b/arch/x86/configs/x86_64_defconfig +index 4a4b16e56d35..e32fc1f274d8 100644 +--- a/arch/x86/configs/x86_64_defconfig ++++ b/arch/x86/configs/x86_64_defconfig +@@ -299,6 +299,7 @@ CONFIG_DEBUG_STACKOVERFLOW=y + # CONFIG_DEBUG_RODATA_TEST is not set + CONFIG_DEBUG_BOOT_PARAMS=y + CONFIG_OPTIMIZE_INLINING=y ++CONFIG_UNWINDER_ORC=y + CONFIG_SECURITY=y + CONFIG_SECURITY_NETWORK=y + CONFIG_SECURITY_SELINUX=y +diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h +index 6e160031cfea..3fd8bc560fae 100644 +--- a/arch/x86/entry/calling.h ++++ b/arch/x86/entry/calling.h +@@ -142,56 +142,25 @@ For 32-bit we have the following conventions - kernel is built with + UNWIND_HINT_REGS offset=\offset + .endm + +- .macro RESTORE_EXTRA_REGS offset=0 +- movq 0*8+\offset(%rsp), %r15 +- movq 1*8+\offset(%rsp), %r14 +- movq 2*8+\offset(%rsp), %r13 +- movq 3*8+\offset(%rsp), %r12 +- movq 4*8+\offset(%rsp), %rbp +- movq 5*8+\offset(%rsp), %rbx +- UNWIND_HINT_REGS offset=\offset extra=0 +- .endm +- +- .macro RESTORE_C_REGS_HELPER 
rstor_rax=1, rstor_rcx=1, rstor_r11=1, rstor_r8910=1, rstor_rdx=1 +- .if \rstor_r11 +- movq 6*8(%rsp), %r11 +- .endif +- .if \rstor_r8910 +- movq 7*8(%rsp), %r10 +- movq 8*8(%rsp), %r9 +- movq 9*8(%rsp), %r8 +- .endif +- .if \rstor_rax +- movq 10*8(%rsp), %rax +- .endif +- .if \rstor_rcx +- movq 11*8(%rsp), %rcx +- .endif +- .if \rstor_rdx +- movq 12*8(%rsp), %rdx +- .endif +- movq 13*8(%rsp), %rsi +- movq 14*8(%rsp), %rdi +- UNWIND_HINT_IRET_REGS offset=16*8 +- .endm +- .macro RESTORE_C_REGS +- RESTORE_C_REGS_HELPER 1,1,1,1,1 +- .endm +- .macro RESTORE_C_REGS_EXCEPT_RAX +- RESTORE_C_REGS_HELPER 0,1,1,1,1 +- .endm +- .macro RESTORE_C_REGS_EXCEPT_RCX +- RESTORE_C_REGS_HELPER 1,0,1,1,1 +- .endm +- .macro RESTORE_C_REGS_EXCEPT_R11 +- RESTORE_C_REGS_HELPER 1,1,0,1,1 +- .endm +- .macro RESTORE_C_REGS_EXCEPT_RCX_R11 +- RESTORE_C_REGS_HELPER 1,0,0,1,1 +- .endm +- +- .macro REMOVE_PT_GPREGS_FROM_STACK addskip=0 +- subq $-(15*8+\addskip), %rsp ++ .macro POP_EXTRA_REGS ++ popq %r15 ++ popq %r14 ++ popq %r13 ++ popq %r12 ++ popq %rbp ++ popq %rbx ++ .endm ++ ++ .macro POP_C_REGS ++ popq %r11 ++ popq %r10 ++ popq %r9 ++ popq %r8 ++ popq %rax ++ popq %rcx ++ popq %rdx ++ popq %rsi ++ popq %rdi + .endm + + .macro icebp +diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S +index 4838037f97f6..bd8b57a5c874 100644 +--- a/arch/x86/entry/entry_32.S ++++ b/arch/x86/entry/entry_32.S +@@ -941,7 +941,8 @@ ENTRY(debug) + movl %esp, %eax # pt_regs pointer + + /* Are we currently on the SYSENTER stack? */ +- PER_CPU(cpu_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx) ++ movl PER_CPU_VAR(cpu_entry_area), %ecx ++ addl $CPU_ENTRY_AREA_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx + subl %eax, %ecx /* ecx = (end of SYSENTER_stack) - esp */ + cmpl $SIZEOF_SYSENTER_stack, %ecx + jb .Ldebug_from_sysenter_stack +@@ -984,7 +985,8 @@ ENTRY(nmi) + movl %esp, %eax # pt_regs pointer + + /* Are we currently on the SYSENTER stack? */ +- PER_CPU(cpu_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx) ++ movl PER_CPU_VAR(cpu_entry_area), %ecx ++ addl $CPU_ENTRY_AREA_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx + subl %eax, %ecx /* ecx = (end of SYSENTER_stack) - esp */ + cmpl $SIZEOF_SYSENTER_stack, %ecx + jb .Lnmi_from_sysenter_stack +diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S +index 2e956afe272c..6abe3fcaece9 100644 +--- a/arch/x86/entry/entry_64.S ++++ b/arch/x86/entry/entry_64.S +@@ -136,6 +136,64 @@ END(native_usergs_sysret64) + * with them due to bugs in both AMD and Intel CPUs. + */ + ++ .pushsection .entry_trampoline, "ax" ++ ++/* ++ * The code in here gets remapped into cpu_entry_area's trampoline. This means ++ * that the assembler and linker have the wrong idea as to where this code ++ * lives (and, in fact, it's mapped more than once, so it's not even at a ++ * fixed address). So we can't reference any symbols outside the entry ++ * trampoline and expect it to work. ++ * ++ * Instead, we carefully abuse %rip-relative addressing. ++ * _entry_trampoline(%rip) refers to the start of the remapped) entry ++ * trampoline. We can thus find cpu_entry_area with this macro: ++ */ ++ ++#define CPU_ENTRY_AREA \ ++ _entry_trampoline - CPU_ENTRY_AREA_entry_trampoline(%rip) ++ ++/* The top word of the SYSENTER stack is hot and is usable as scratch space. */ ++#define RSP_SCRATCH CPU_ENTRY_AREA_SYSENTER_stack + \ ++ SIZEOF_SYSENTER_stack - 8 + CPU_ENTRY_AREA ++ ++ENTRY(entry_SYSCALL_64_trampoline) ++ UNWIND_HINT_EMPTY ++ swapgs ++ ++ /* Stash the user RSP. 
*/ ++ movq %rsp, RSP_SCRATCH ++ ++ /* Load the top of the task stack into RSP */ ++ movq CPU_ENTRY_AREA_tss + TSS_sp1 + CPU_ENTRY_AREA, %rsp ++ ++ /* Start building the simulated IRET frame. */ ++ pushq $__USER_DS /* pt_regs->ss */ ++ pushq RSP_SCRATCH /* pt_regs->sp */ ++ pushq %r11 /* pt_regs->flags */ ++ pushq $__USER_CS /* pt_regs->cs */ ++ pushq %rcx /* pt_regs->ip */ ++ ++ /* ++ * x86 lacks a near absolute jump, and we can't jump to the real ++ * entry text with a relative jump. We could push the target ++ * address and then use retq, but this destroys the pipeline on ++ * many CPUs (wasting over 20 cycles on Sandy Bridge). Instead, ++ * spill RDI and restore it in a second-stage trampoline. ++ */ ++ pushq %rdi ++ movq $entry_SYSCALL_64_stage2, %rdi ++ jmp *%rdi ++END(entry_SYSCALL_64_trampoline) ++ ++ .popsection ++ ++ENTRY(entry_SYSCALL_64_stage2) ++ UNWIND_HINT_EMPTY ++ popq %rdi ++ jmp entry_SYSCALL_64_after_hwframe ++END(entry_SYSCALL_64_stage2) ++ + ENTRY(entry_SYSCALL_64) + UNWIND_HINT_EMPTY + /* +@@ -221,10 +279,9 @@ entry_SYSCALL_64_fastpath: + TRACE_IRQS_ON /* user mode is traced as IRQs on */ + movq RIP(%rsp), %rcx + movq EFLAGS(%rsp), %r11 +- RESTORE_C_REGS_EXCEPT_RCX_R11 +- movq RSP(%rsp), %rsp ++ addq $6*8, %rsp /* skip extra regs -- they were preserved */ + UNWIND_HINT_EMPTY +- USERGS_SYSRET64 ++ jmp .Lpop_c_regs_except_rcx_r11_and_sysret + + 1: + /* +@@ -246,17 +303,18 @@ entry_SYSCALL64_slow_path: + call do_syscall_64 /* returns with IRQs disabled */ + + return_from_SYSCALL_64: +- RESTORE_EXTRA_REGS + TRACE_IRQS_IRETQ /* we're about to change IF */ + + /* + * Try to use SYSRET instead of IRET if we're returning to +- * a completely clean 64-bit userspace context. ++ * a completely clean 64-bit userspace context. If we're not, ++ * go to the slow exit path. + */ + movq RCX(%rsp), %rcx + movq RIP(%rsp), %r11 +- cmpq %rcx, %r11 /* RCX == RIP */ +- jne opportunistic_sysret_failed ++ ++ cmpq %rcx, %r11 /* SYSRET requires RCX == RIP */ ++ jne swapgs_restore_regs_and_return_to_usermode + + /* + * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP +@@ -274,14 +332,14 @@ return_from_SYSCALL_64: + + /* If this changed %rcx, it was not canonical */ + cmpq %rcx, %r11 +- jne opportunistic_sysret_failed ++ jne swapgs_restore_regs_and_return_to_usermode + + cmpq $__USER_CS, CS(%rsp) /* CS must match SYSRET */ +- jne opportunistic_sysret_failed ++ jne swapgs_restore_regs_and_return_to_usermode + + movq R11(%rsp), %r11 + cmpq %r11, EFLAGS(%rsp) /* R11 == RFLAGS */ +- jne opportunistic_sysret_failed ++ jne swapgs_restore_regs_and_return_to_usermode + + /* + * SYSCALL clears RF when it saves RFLAGS in R11 and SYSRET cannot +@@ -302,12 +360,12 @@ return_from_SYSCALL_64: + * would never get past 'stuck_here'. + */ + testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11 +- jnz opportunistic_sysret_failed ++ jnz swapgs_restore_regs_and_return_to_usermode + + /* nothing to check for RSP */ + + cmpq $__USER_DS, SS(%rsp) /* SS must match SYSRET */ +- jne opportunistic_sysret_failed ++ jne swapgs_restore_regs_and_return_to_usermode + + /* + * We win! 
This label is here just for ease of understanding +@@ -315,14 +373,36 @@ return_from_SYSCALL_64: + */ + syscall_return_via_sysret: + /* rcx and r11 are already restored (see code above) */ +- RESTORE_C_REGS_EXCEPT_RCX_R11 +- movq RSP(%rsp), %rsp + UNWIND_HINT_EMPTY +- USERGS_SYSRET64 ++ POP_EXTRA_REGS ++.Lpop_c_regs_except_rcx_r11_and_sysret: ++ popq %rsi /* skip r11 */ ++ popq %r10 ++ popq %r9 ++ popq %r8 ++ popq %rax ++ popq %rsi /* skip rcx */ ++ popq %rdx ++ popq %rsi + +-opportunistic_sysret_failed: +- SWAPGS +- jmp restore_c_regs_and_iret ++ /* ++ * Now all regs are restored except RSP and RDI. ++ * Save old stack pointer and switch to trampoline stack. ++ */ ++ movq %rsp, %rdi ++ movq PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp ++ ++ pushq RSP-RDI(%rdi) /* RSP */ ++ pushq (%rdi) /* RDI */ ++ ++ /* ++ * We are on the trampoline stack. All regs except RDI are live. ++ * We can do future final exit work right here. ++ */ ++ ++ popq %rdi ++ popq %rsp ++ USERGS_SYSRET64 + END(entry_SYSCALL_64) + + ENTRY(stub_ptregs_64) +@@ -423,8 +503,7 @@ ENTRY(ret_from_fork) + movq %rsp, %rdi + call syscall_return_slowpath /* returns with IRQs disabled */ + TRACE_IRQS_ON /* user mode is traced as IRQS on */ +- SWAPGS +- jmp restore_regs_and_iret ++ jmp swapgs_restore_regs_and_return_to_usermode + + 1: + /* kernel thread */ +@@ -457,12 +536,13 @@ END(irq_entries_start) + + .macro DEBUG_ENTRY_ASSERT_IRQS_OFF + #ifdef CONFIG_DEBUG_ENTRY +- pushfq +- testl $X86_EFLAGS_IF, (%rsp) ++ pushq %rax ++ SAVE_FLAGS(CLBR_RAX) ++ testl $X86_EFLAGS_IF, %eax + jz .Lokay_\@ + ud2 + .Lokay_\@: +- addq $8, %rsp ++ popq %rax + #endif + .endm + +@@ -554,6 +634,13 @@ END(irq_entries_start) + /* 0(%rsp): ~(interrupt number) */ + .macro interrupt func + cld ++ ++ testb $3, CS-ORIG_RAX(%rsp) ++ jz 1f ++ SWAPGS ++ call switch_to_thread_stack ++1: ++ + ALLOC_PT_GPREGS_ON_STACK + SAVE_C_REGS + SAVE_EXTRA_REGS +@@ -563,12 +650,8 @@ END(irq_entries_start) + jz 1f + + /* +- * IRQ from user mode. Switch to kernel gsbase and inform context +- * tracking that we're in kernel mode. +- */ +- SWAPGS +- +- /* ++ * IRQ from user mode. ++ * + * We need to tell lockdep that IRQs are off. We can't do this until + * we fix gsbase, and we should do it before enter_from_user_mode + * (which can take locks). Since TRACE_IRQS_OFF idempotent, +@@ -612,8 +695,52 @@ GLOBAL(retint_user) + mov %rsp,%rdi + call prepare_exit_to_usermode + TRACE_IRQS_IRETQ ++ ++GLOBAL(swapgs_restore_regs_and_return_to_usermode) ++#ifdef CONFIG_DEBUG_ENTRY ++ /* Assert that pt_regs indicates user mode. */ ++ testb $3, CS(%rsp) ++ jnz 1f ++ ud2 ++1: ++#endif ++ POP_EXTRA_REGS ++ popq %r11 ++ popq %r10 ++ popq %r9 ++ popq %r8 ++ popq %rax ++ popq %rcx ++ popq %rdx ++ popq %rsi ++ ++ /* ++ * The stack is now user RDI, orig_ax, RIP, CS, EFLAGS, RSP, SS. ++ * Save old stack pointer and switch to trampoline stack. ++ */ ++ movq %rsp, %rdi ++ movq PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp ++ ++ /* Copy the IRET frame to the trampoline stack. */ ++ pushq 6*8(%rdi) /* SS */ ++ pushq 5*8(%rdi) /* RSP */ ++ pushq 4*8(%rdi) /* EFLAGS */ ++ pushq 3*8(%rdi) /* CS */ ++ pushq 2*8(%rdi) /* RIP */ ++ ++ /* Push user RDI on the trampoline stack. */ ++ pushq (%rdi) ++ ++ /* ++ * We are on the trampoline stack. All regs except RDI are live. ++ * We can do future final exit work right here. ++ */ ++ ++ /* Restore RDI. 
*/ ++ popq %rdi + SWAPGS +- jmp restore_regs_and_iret ++ INTERRUPT_RETURN ++ + + /* Returning to kernel space */ + retint_kernel: +@@ -633,15 +760,17 @@ retint_kernel: + */ + TRACE_IRQS_IRETQ + +-/* +- * At this label, code paths which return to kernel and to user, +- * which come from interrupts/exception and from syscalls, merge. +- */ +-GLOBAL(restore_regs_and_iret) +- RESTORE_EXTRA_REGS +-restore_c_regs_and_iret: +- RESTORE_C_REGS +- REMOVE_PT_GPREGS_FROM_STACK 8 ++GLOBAL(restore_regs_and_return_to_kernel) ++#ifdef CONFIG_DEBUG_ENTRY ++ /* Assert that pt_regs indicates kernel mode. */ ++ testb $3, CS(%rsp) ++ jz 1f ++ ud2 ++1: ++#endif ++ POP_EXTRA_REGS ++ POP_C_REGS ++ addq $8, %rsp /* skip regs->orig_ax */ + INTERRUPT_RETURN + + ENTRY(native_iret) +@@ -805,7 +934,33 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt + /* + * Exception entry points. + */ +-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8) ++#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + ((x) - 1) * 8) ++ ++/* ++ * Switch to the thread stack. This is called with the IRET frame and ++ * orig_ax on the stack. (That is, RDI..R12 are not on the stack and ++ * space has not been allocated for them.) ++ */ ++ENTRY(switch_to_thread_stack) ++ UNWIND_HINT_FUNC ++ ++ pushq %rdi ++ movq %rsp, %rdi ++ movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp ++ UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI ++ ++ pushq 7*8(%rdi) /* regs->ss */ ++ pushq 6*8(%rdi) /* regs->rsp */ ++ pushq 5*8(%rdi) /* regs->eflags */ ++ pushq 4*8(%rdi) /* regs->cs */ ++ pushq 3*8(%rdi) /* regs->ip */ ++ pushq 2*8(%rdi) /* regs->orig_ax */ ++ pushq 8(%rdi) /* return address */ ++ UNWIND_HINT_FUNC ++ ++ movq (%rdi), %rdi ++ ret ++END(switch_to_thread_stack) + + .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1 + ENTRY(\sym) +@@ -818,17 +973,18 @@ ENTRY(\sym) + + ASM_CLAC + +- .ifeq \has_error_code ++ .if \has_error_code == 0 + pushq $-1 /* ORIG_RAX: no syscall to restart */ + .endif + + ALLOC_PT_GPREGS_ON_STACK + +- .if \paranoid +- .if \paranoid == 1 ++ .if \paranoid < 2 + testb $3, CS(%rsp) /* If coming from userspace, switch stacks */ +- jnz 1f ++ jnz .Lfrom_usermode_switch_stack_\@ + .endif ++ ++ .if \paranoid + call paranoid_entry + .else + call error_entry +@@ -870,20 +1026,15 @@ ENTRY(\sym) + jmp error_exit + .endif + +- .if \paranoid == 1 ++ .if \paranoid < 2 + /* +- * Paranoid entry from userspace. Switch stacks and treat it ++ * Entry from userspace. Switch stacks and treat it + * as a normal entry. This means that paranoid handlers + * run in real process context if user_mode(regs). + */ +-1: ++.Lfrom_usermode_switch_stack_\@: + call error_entry + +- +- movq %rsp, %rdi /* pt_regs pointer */ +- call sync_regs +- movq %rax, %rsp /* switch stack */ +- + movq %rsp, %rdi /* pt_regs pointer */ + + .if \has_error_code +@@ -1059,6 +1210,7 @@ idtentry int3 do_int3 has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK + idtentry stack_segment do_stack_segment has_error_code=1 + + #ifdef CONFIG_XEN ++idtentry xennmi do_nmi has_error_code=0 + idtentry xendebug do_debug has_error_code=0 + idtentry xenint3 do_int3 has_error_code=0 + #endif +@@ -1112,17 +1264,14 @@ ENTRY(paranoid_exit) + DISABLE_INTERRUPTS(CLBR_ANY) + TRACE_IRQS_OFF_DEBUG + testl %ebx, %ebx /* swapgs needed? 
*/ +- jnz paranoid_exit_no_swapgs ++ jnz .Lparanoid_exit_no_swapgs + TRACE_IRQS_IRETQ + SWAPGS_UNSAFE_STACK +- jmp paranoid_exit_restore +-paranoid_exit_no_swapgs: ++ jmp .Lparanoid_exit_restore ++.Lparanoid_exit_no_swapgs: + TRACE_IRQS_IRETQ_DEBUG +-paranoid_exit_restore: +- RESTORE_EXTRA_REGS +- RESTORE_C_REGS +- REMOVE_PT_GPREGS_FROM_STACK 8 +- INTERRUPT_RETURN ++.Lparanoid_exit_restore: ++ jmp restore_regs_and_return_to_kernel + END(paranoid_exit) + + /* +@@ -1146,6 +1295,14 @@ ENTRY(error_entry) + SWAPGS + + .Lerror_entry_from_usermode_after_swapgs: ++ /* Put us onto the real thread stack. */ ++ popq %r12 /* save return addr in %12 */ ++ movq %rsp, %rdi /* arg0 = pt_regs pointer */ ++ call sync_regs ++ movq %rax, %rsp /* switch stack */ ++ ENCODE_FRAME_POINTER ++ pushq %r12 ++ + /* + * We need to tell lockdep that IRQs are off. We can't do this until + * we fix gsbase, and we should do it before enter_from_user_mode +@@ -1223,10 +1380,13 @@ ENTRY(error_exit) + jmp retint_user + END(error_exit) + +-/* Runs on exception stack */ +-/* XXX: broken on Xen PV */ ++/* ++ * Runs on exception stack. Xen PV does not go through this path at all, ++ * so we can use real assembly here. ++ */ + ENTRY(nmi) + UNWIND_HINT_IRET_REGS ++ + /* + * We allow breakpoints in NMIs. If a breakpoint occurs, then + * the iretq it performs will take us out of NMI context. +@@ -1284,7 +1444,7 @@ ENTRY(nmi) + * stacks lest we corrupt the "NMI executing" variable. + */ + +- SWAPGS_UNSAFE_STACK ++ swapgs + cld + movq %rsp, %rdx + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp +@@ -1328,8 +1488,7 @@ ENTRY(nmi) + * Return back to user mode. We must *not* do the normal exit + * work, because we don't want to enable interrupts. + */ +- SWAPGS +- jmp restore_regs_and_iret ++ jmp swapgs_restore_regs_and_return_to_usermode + + .Lnmi_from_kernel: + /* +@@ -1450,7 +1609,7 @@ nested_nmi_out: + popq %rdx + + /* We are returning to kernel mode, so this cannot result in a fault. */ +- INTERRUPT_RETURN ++ iretq + + first_nmi: + /* Restore rdx. */ +@@ -1481,7 +1640,7 @@ first_nmi: + pushfq /* RFLAGS */ + pushq $__KERNEL_CS /* CS */ + pushq $1f /* RIP */ +- INTERRUPT_RETURN /* continues at repeat_nmi below */ ++ iretq /* continues at repeat_nmi below */ + UNWIND_HINT_IRET_REGS + 1: + #endif +@@ -1544,29 +1703,34 @@ end_repeat_nmi: + nmi_swapgs: + SWAPGS_UNSAFE_STACK + nmi_restore: +- RESTORE_EXTRA_REGS +- RESTORE_C_REGS ++ POP_EXTRA_REGS ++ POP_C_REGS + +- /* Point RSP at the "iret" frame. */ +- REMOVE_PT_GPREGS_FROM_STACK 6*8 ++ /* ++ * Skip orig_ax and the "outermost" frame to point RSP at the "iret" ++ * at the "iret" frame. ++ */ ++ addq $6*8, %rsp + + /* + * Clear "NMI executing". Set DF first so that we can easily + * distinguish the remaining code between here and IRET from +- * the SYSCALL entry and exit paths. On a native kernel, we +- * could just inspect RIP, but, on paravirt kernels, +- * INTERRUPT_RETURN can translate into a jump into a +- * hypercall page. ++ * the SYSCALL entry and exit paths. ++ * ++ * We arguably should just inspect RIP instead, but I (Andy) wrote ++ * this code when I had the misapprehension that Xen PV supported ++ * NMIs, and Xen PV would break that approach. + */ + std + movq $0, 5*8(%rsp) /* clear "NMI executing" */ + + /* +- * INTERRUPT_RETURN reads the "iret" frame and exits the NMI +- * stack in a single instruction. We are returning to kernel +- * mode, so this cannot result in a fault. ++ * iretq reads the "iret" frame and exits the NMI stack in a ++ * single instruction. 
We are returning to kernel mode, so this ++ * cannot result in a fault. Similarly, we don't need to worry ++ * about espfix64 on the way back to kernel mode. + */ +- INTERRUPT_RETURN ++ iretq + END(nmi) + + ENTRY(ignore_sysret) +diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S +index b5c7a56ed256..95ad40eb7eff 100644 +--- a/arch/x86/entry/entry_64_compat.S ++++ b/arch/x86/entry/entry_64_compat.S +@@ -48,7 +48,7 @@ + */ + ENTRY(entry_SYSENTER_compat) + /* Interrupts are off on entry. */ +- SWAPGS_UNSAFE_STACK ++ SWAPGS + movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp + + /* +@@ -306,8 +306,11 @@ ENTRY(entry_INT80_compat) + */ + movl %eax, %eax + +- /* Construct struct pt_regs on stack (iret frame is already on stack) */ + pushq %rax /* pt_regs->orig_ax */ ++ ++ /* switch to thread stack expects orig_ax to be pushed */ ++ call switch_to_thread_stack ++ + pushq %rdi /* pt_regs->di */ + pushq %rsi /* pt_regs->si */ + pushq %rdx /* pt_regs->dx */ +@@ -337,8 +340,7 @@ ENTRY(entry_INT80_compat) + + /* Go back to user mode. */ + TRACE_IRQS_ON +- SWAPGS +- jmp restore_regs_and_iret ++ jmp swapgs_restore_regs_and_return_to_usermode + END(entry_INT80_compat) + + ENTRY(stub32_clone) +diff --git a/arch/x86/entry/syscalls/Makefile b/arch/x86/entry/syscalls/Makefile +index 331f1dca5085..6fb9b57ed5ba 100644 +--- a/arch/x86/entry/syscalls/Makefile ++++ b/arch/x86/entry/syscalls/Makefile +@@ -1,6 +1,6 @@ + # SPDX-License-Identifier: GPL-2.0 +-out := $(obj)/../../include/generated/asm +-uapi := $(obj)/../../include/generated/uapi/asm ++out := arch/$(SRCARCH)/include/generated/asm ++uapi := arch/$(SRCARCH)/include/generated/uapi/asm + + # Create output directory if not already present + _dummy := $(shell [ -d '$(out)' ] || mkdir -p '$(out)') \ +diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c +index 80534d3c2480..589af1eec7c1 100644 +--- a/arch/x86/events/core.c ++++ b/arch/x86/events/core.c +@@ -2371,7 +2371,7 @@ static unsigned long get_segment_base(unsigned int segment) + struct ldt_struct *ldt; + + /* IRQs are off, so this synchronizes with smp_store_release */ +- ldt = lockless_dereference(current->active_mm->context.ldt); ++ ldt = READ_ONCE(current->active_mm->context.ldt); + if (!ldt || idx >= ldt->nr_entries) + return 0; + +diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c +index f94855000d4e..09c26a4f139c 100644 +--- a/arch/x86/events/intel/core.c ++++ b/arch/x86/events/intel/core.c +@@ -2958,6 +2958,10 @@ static unsigned long intel_pmu_free_running_flags(struct perf_event *event) + + if (event->attr.use_clockid) + flags &= ~PERF_SAMPLE_TIME; ++ if (!event->attr.exclude_kernel) ++ flags &= ~PERF_SAMPLE_REGS_USER; ++ if (event->attr.sample_regs_user & ~PEBS_REGS) ++ flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR); + return flags; + } + +diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h +index 4196f81ec0e1..f7aaadf9331f 100644 +--- a/arch/x86/events/perf_event.h ++++ b/arch/x86/events/perf_event.h +@@ -85,13 +85,15 @@ struct amd_nb { + * Flags PEBS can handle without an PMI. + * + * TID can only be handled by flushing at context switch. ++ * REGS_USER can be handled for events limited to ring 3. 
+ * + */ + #define PEBS_FREERUNNING_FLAGS \ + (PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR | \ + PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID | \ + PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER | \ +- PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR) ++ PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR | \ ++ PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER) + + /* + * A debug store configuration. +@@ -110,6 +112,26 @@ struct debug_store { + u64 pebs_event_reset[MAX_PEBS_EVENTS]; + }; + ++#define PEBS_REGS \ ++ (PERF_REG_X86_AX | \ ++ PERF_REG_X86_BX | \ ++ PERF_REG_X86_CX | \ ++ PERF_REG_X86_DX | \ ++ PERF_REG_X86_DI | \ ++ PERF_REG_X86_SI | \ ++ PERF_REG_X86_SP | \ ++ PERF_REG_X86_BP | \ ++ PERF_REG_X86_IP | \ ++ PERF_REG_X86_FLAGS | \ ++ PERF_REG_X86_R8 | \ ++ PERF_REG_X86_R9 | \ ++ PERF_REG_X86_R10 | \ ++ PERF_REG_X86_R11 | \ ++ PERF_REG_X86_R12 | \ ++ PERF_REG_X86_R13 | \ ++ PERF_REG_X86_R14 | \ ++ PERF_REG_X86_R15) ++ + /* + * Per register state. + */ +diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c +index a5db63f728a2..a0b86cf486e0 100644 +--- a/arch/x86/hyperv/hv_init.c ++++ b/arch/x86/hyperv/hv_init.c +@@ -113,7 +113,7 @@ void hyperv_init(void) + u64 guest_id; + union hv_x64_msr_hypercall_contents hypercall_msr; + +- if (x86_hyper != &x86_hyper_ms_hyperv) ++ if (x86_hyper_type != X86_HYPER_MS_HYPERV) + return; + + /* Allocate percpu VP index */ +diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h +index 5b0579abb398..3ac991d81e74 100644 +--- a/arch/x86/include/asm/archrandom.h ++++ b/arch/x86/include/asm/archrandom.h +@@ -45,7 +45,7 @@ static inline bool rdrand_long(unsigned long *v) + bool ok; + unsigned int retry = RDRAND_RETRY_LOOPS; + do { +- asm volatile(RDRAND_LONG "\n\t" ++ asm volatile(RDRAND_LONG + CC_SET(c) + : CC_OUT(c) (ok), "=a" (*v)); + if (ok) +@@ -59,7 +59,7 @@ static inline bool rdrand_int(unsigned int *v) + bool ok; + unsigned int retry = RDRAND_RETRY_LOOPS; + do { +- asm volatile(RDRAND_INT "\n\t" ++ asm volatile(RDRAND_INT + CC_SET(c) + : CC_OUT(c) (ok), "=a" (*v)); + if (ok) +@@ -71,7 +71,7 @@ static inline bool rdrand_int(unsigned int *v) + static inline bool rdseed_long(unsigned long *v) + { + bool ok; +- asm volatile(RDSEED_LONG "\n\t" ++ asm volatile(RDSEED_LONG + CC_SET(c) + : CC_OUT(c) (ok), "=a" (*v)); + return ok; +@@ -80,7 +80,7 @@ static inline bool rdseed_long(unsigned long *v) + static inline bool rdseed_int(unsigned int *v) + { + bool ok; +- asm volatile(RDSEED_INT "\n\t" ++ asm volatile(RDSEED_INT + CC_SET(c) + : CC_OUT(c) (ok), "=a" (*v)); + return ok; +diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h +index 2bcf47314959..3fa039855b8f 100644 +--- a/arch/x86/include/asm/bitops.h ++++ b/arch/x86/include/asm/bitops.h +@@ -143,7 +143,7 @@ static __always_inline void __clear_bit(long nr, volatile unsigned long *addr) + static __always_inline bool clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr) + { + bool negative; +- asm volatile(LOCK_PREFIX "andb %2,%1\n\t" ++ asm volatile(LOCK_PREFIX "andb %2,%1" + CC_SET(s) + : CC_OUT(s) (negative), ADDR + : "ir" ((char) ~(1 << nr)) : "memory"); +@@ -246,7 +246,7 @@ static __always_inline bool __test_and_set_bit(long nr, volatile unsigned long * + { + bool oldbit; + +- asm("bts %2,%1\n\t" ++ asm("bts %2,%1" + CC_SET(c) + : CC_OUT(c) (oldbit), ADDR + : "Ir" (nr)); +@@ -286,7 +286,7 @@ static __always_inline bool __test_and_clear_bit(long nr, volatile unsigned long + { + bool oldbit; + +- asm 
volatile("btr %2,%1\n\t" ++ asm volatile("btr %2,%1" + CC_SET(c) + : CC_OUT(c) (oldbit), ADDR + : "Ir" (nr)); +@@ -298,7 +298,7 @@ static __always_inline bool __test_and_change_bit(long nr, volatile unsigned lon + { + bool oldbit; + +- asm volatile("btc %2,%1\n\t" ++ asm volatile("btc %2,%1" + CC_SET(c) + : CC_OUT(c) (oldbit), ADDR + : "Ir" (nr) : "memory"); +@@ -329,7 +329,7 @@ static __always_inline bool variable_test_bit(long nr, volatile const unsigned l + { + bool oldbit; + +- asm volatile("bt %2,%1\n\t" ++ asm volatile("bt %2,%1" + CC_SET(c) + : CC_OUT(c) (oldbit) + : "m" (*(unsigned long *)addr), "Ir" (nr)); +diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h +index 70bc1df580b2..2cbd75dd2fd3 100644 +--- a/arch/x86/include/asm/compat.h ++++ b/arch/x86/include/asm/compat.h +@@ -7,6 +7,7 @@ + */ + #include + #include ++#include + #include + #include + #include +diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h +index 0dfa68438e80..ea9a7dde62e5 100644 +--- a/arch/x86/include/asm/cpufeature.h ++++ b/arch/x86/include/asm/cpufeature.h +@@ -126,16 +126,17 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; + #define boot_cpu_has(bit) cpu_has(&boot_cpu_data, bit) + + #define set_cpu_cap(c, bit) set_bit(bit, (unsigned long *)((c)->x86_capability)) +-#define clear_cpu_cap(c, bit) clear_bit(bit, (unsigned long *)((c)->x86_capability)) +-#define setup_clear_cpu_cap(bit) do { \ +- clear_cpu_cap(&boot_cpu_data, bit); \ +- set_bit(bit, (unsigned long *)cpu_caps_cleared); \ +-} while (0) ++ ++extern void setup_clear_cpu_cap(unsigned int bit); ++extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit); ++ + #define setup_force_cpu_cap(bit) do { \ + set_cpu_cap(&boot_cpu_data, bit); \ + set_bit(bit, (unsigned long *)cpu_caps_set); \ + } while (0) + ++#define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit) ++ + #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_X86_FAST_FEATURE_TESTS) + /* + * Static testing of CPU features. Used the same as boot_cpu_has(). +diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h +index 793690fbda36..800104c8a3ed 100644 +--- a/arch/x86/include/asm/cpufeatures.h ++++ b/arch/x86/include/asm/cpufeatures.h +@@ -13,173 +13,176 @@ + /* + * Defines x86 CPU feature bits + */ +-#define NCAPINTS 18 /* N 32-bit words worth of info */ +-#define NBUGINTS 1 /* N 32-bit bug flags */ ++#define NCAPINTS 18 /* N 32-bit words worth of info */ ++#define NBUGINTS 1 /* N 32-bit bug flags */ + + /* + * Note: If the comment begins with a quoted string, that string is used + * in /proc/cpuinfo instead of the macro name. If the string is "", + * this feature bit is not displayed in /proc/cpuinfo at all. ++ * ++ * When adding new features here that depend on other features, ++ * please update the table in kernel/cpu/cpuid-deps.c as well. 
+ */ + +-/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */ +-#define X86_FEATURE_FPU ( 0*32+ 0) /* Onboard FPU */ +-#define X86_FEATURE_VME ( 0*32+ 1) /* Virtual Mode Extensions */ +-#define X86_FEATURE_DE ( 0*32+ 2) /* Debugging Extensions */ +-#define X86_FEATURE_PSE ( 0*32+ 3) /* Page Size Extensions */ +-#define X86_FEATURE_TSC ( 0*32+ 4) /* Time Stamp Counter */ +-#define X86_FEATURE_MSR ( 0*32+ 5) /* Model-Specific Registers */ +-#define X86_FEATURE_PAE ( 0*32+ 6) /* Physical Address Extensions */ +-#define X86_FEATURE_MCE ( 0*32+ 7) /* Machine Check Exception */ +-#define X86_FEATURE_CX8 ( 0*32+ 8) /* CMPXCHG8 instruction */ +-#define X86_FEATURE_APIC ( 0*32+ 9) /* Onboard APIC */ +-#define X86_FEATURE_SEP ( 0*32+11) /* SYSENTER/SYSEXIT */ +-#define X86_FEATURE_MTRR ( 0*32+12) /* Memory Type Range Registers */ +-#define X86_FEATURE_PGE ( 0*32+13) /* Page Global Enable */ +-#define X86_FEATURE_MCA ( 0*32+14) /* Machine Check Architecture */ +-#define X86_FEATURE_CMOV ( 0*32+15) /* CMOV instructions */ +- /* (plus FCMOVcc, FCOMI with FPU) */ +-#define X86_FEATURE_PAT ( 0*32+16) /* Page Attribute Table */ +-#define X86_FEATURE_PSE36 ( 0*32+17) /* 36-bit PSEs */ +-#define X86_FEATURE_PN ( 0*32+18) /* Processor serial number */ +-#define X86_FEATURE_CLFLUSH ( 0*32+19) /* CLFLUSH instruction */ +-#define X86_FEATURE_DS ( 0*32+21) /* "dts" Debug Store */ +-#define X86_FEATURE_ACPI ( 0*32+22) /* ACPI via MSR */ +-#define X86_FEATURE_MMX ( 0*32+23) /* Multimedia Extensions */ +-#define X86_FEATURE_FXSR ( 0*32+24) /* FXSAVE/FXRSTOR, CR4.OSFXSR */ +-#define X86_FEATURE_XMM ( 0*32+25) /* "sse" */ +-#define X86_FEATURE_XMM2 ( 0*32+26) /* "sse2" */ +-#define X86_FEATURE_SELFSNOOP ( 0*32+27) /* "ss" CPU self snoop */ +-#define X86_FEATURE_HT ( 0*32+28) /* Hyper-Threading */ +-#define X86_FEATURE_ACC ( 0*32+29) /* "tm" Automatic clock control */ +-#define X86_FEATURE_IA64 ( 0*32+30) /* IA-64 processor */ +-#define X86_FEATURE_PBE ( 0*32+31) /* Pending Break Enable */ ++/* Intel-defined CPU features, CPUID level 0x00000001 (EDX), word 0 */ ++#define X86_FEATURE_FPU ( 0*32+ 0) /* Onboard FPU */ ++#define X86_FEATURE_VME ( 0*32+ 1) /* Virtual Mode Extensions */ ++#define X86_FEATURE_DE ( 0*32+ 2) /* Debugging Extensions */ ++#define X86_FEATURE_PSE ( 0*32+ 3) /* Page Size Extensions */ ++#define X86_FEATURE_TSC ( 0*32+ 4) /* Time Stamp Counter */ ++#define X86_FEATURE_MSR ( 0*32+ 5) /* Model-Specific Registers */ ++#define X86_FEATURE_PAE ( 0*32+ 6) /* Physical Address Extensions */ ++#define X86_FEATURE_MCE ( 0*32+ 7) /* Machine Check Exception */ ++#define X86_FEATURE_CX8 ( 0*32+ 8) /* CMPXCHG8 instruction */ ++#define X86_FEATURE_APIC ( 0*32+ 9) /* Onboard APIC */ ++#define X86_FEATURE_SEP ( 0*32+11) /* SYSENTER/SYSEXIT */ ++#define X86_FEATURE_MTRR ( 0*32+12) /* Memory Type Range Registers */ ++#define X86_FEATURE_PGE ( 0*32+13) /* Page Global Enable */ ++#define X86_FEATURE_MCA ( 0*32+14) /* Machine Check Architecture */ ++#define X86_FEATURE_CMOV ( 0*32+15) /* CMOV instructions (plus FCMOVcc, FCOMI with FPU) */ ++#define X86_FEATURE_PAT ( 0*32+16) /* Page Attribute Table */ ++#define X86_FEATURE_PSE36 ( 0*32+17) /* 36-bit PSEs */ ++#define X86_FEATURE_PN ( 0*32+18) /* Processor serial number */ ++#define X86_FEATURE_CLFLUSH ( 0*32+19) /* CLFLUSH instruction */ ++#define X86_FEATURE_DS ( 0*32+21) /* "dts" Debug Store */ ++#define X86_FEATURE_ACPI ( 0*32+22) /* ACPI via MSR */ ++#define X86_FEATURE_MMX ( 0*32+23) /* Multimedia Extensions */ ++#define X86_FEATURE_FXSR ( 
0*32+24) /* FXSAVE/FXRSTOR, CR4.OSFXSR */ ++#define X86_FEATURE_XMM ( 0*32+25) /* "sse" */ ++#define X86_FEATURE_XMM2 ( 0*32+26) /* "sse2" */ ++#define X86_FEATURE_SELFSNOOP ( 0*32+27) /* "ss" CPU self snoop */ ++#define X86_FEATURE_HT ( 0*32+28) /* Hyper-Threading */ ++#define X86_FEATURE_ACC ( 0*32+29) /* "tm" Automatic clock control */ ++#define X86_FEATURE_IA64 ( 0*32+30) /* IA-64 processor */ ++#define X86_FEATURE_PBE ( 0*32+31) /* Pending Break Enable */ + + /* AMD-defined CPU features, CPUID level 0x80000001, word 1 */ + /* Don't duplicate feature flags which are redundant with Intel! */ +-#define X86_FEATURE_SYSCALL ( 1*32+11) /* SYSCALL/SYSRET */ +-#define X86_FEATURE_MP ( 1*32+19) /* MP Capable. */ +-#define X86_FEATURE_NX ( 1*32+20) /* Execute Disable */ +-#define X86_FEATURE_MMXEXT ( 1*32+22) /* AMD MMX extensions */ +-#define X86_FEATURE_FXSR_OPT ( 1*32+25) /* FXSAVE/FXRSTOR optimizations */ +-#define X86_FEATURE_GBPAGES ( 1*32+26) /* "pdpe1gb" GB pages */ +-#define X86_FEATURE_RDTSCP ( 1*32+27) /* RDTSCP */ +-#define X86_FEATURE_LM ( 1*32+29) /* Long Mode (x86-64) */ +-#define X86_FEATURE_3DNOWEXT ( 1*32+30) /* AMD 3DNow! extensions */ +-#define X86_FEATURE_3DNOW ( 1*32+31) /* 3DNow! */ ++#define X86_FEATURE_SYSCALL ( 1*32+11) /* SYSCALL/SYSRET */ ++#define X86_FEATURE_MP ( 1*32+19) /* MP Capable */ ++#define X86_FEATURE_NX ( 1*32+20) /* Execute Disable */ ++#define X86_FEATURE_MMXEXT ( 1*32+22) /* AMD MMX extensions */ ++#define X86_FEATURE_FXSR_OPT ( 1*32+25) /* FXSAVE/FXRSTOR optimizations */ ++#define X86_FEATURE_GBPAGES ( 1*32+26) /* "pdpe1gb" GB pages */ ++#define X86_FEATURE_RDTSCP ( 1*32+27) /* RDTSCP */ ++#define X86_FEATURE_LM ( 1*32+29) /* Long Mode (x86-64, 64-bit support) */ ++#define X86_FEATURE_3DNOWEXT ( 1*32+30) /* AMD 3DNow extensions */ ++#define X86_FEATURE_3DNOW ( 1*32+31) /* 3DNow */ + + /* Transmeta-defined CPU features, CPUID level 0x80860001, word 2 */ +-#define X86_FEATURE_RECOVERY ( 2*32+ 0) /* CPU in recovery mode */ +-#define X86_FEATURE_LONGRUN ( 2*32+ 1) /* Longrun power control */ +-#define X86_FEATURE_LRTI ( 2*32+ 3) /* LongRun table interface */ ++#define X86_FEATURE_RECOVERY ( 2*32+ 0) /* CPU in recovery mode */ ++#define X86_FEATURE_LONGRUN ( 2*32+ 1) /* Longrun power control */ ++#define X86_FEATURE_LRTI ( 2*32+ 3) /* LongRun table interface */ + + /* Other features, Linux-defined mapping, word 3 */ + /* This range is used for feature bits which conflict or are synthesized */ +-#define X86_FEATURE_CXMMX ( 3*32+ 0) /* Cyrix MMX extensions */ +-#define X86_FEATURE_K6_MTRR ( 3*32+ 1) /* AMD K6 nonstandard MTRRs */ +-#define X86_FEATURE_CYRIX_ARR ( 3*32+ 2) /* Cyrix ARRs (= MTRRs) */ +-#define X86_FEATURE_CENTAUR_MCR ( 3*32+ 3) /* Centaur MCRs (= MTRRs) */ +-/* cpu types for specific tunings: */ +-#define X86_FEATURE_K8 ( 3*32+ 4) /* "" Opteron, Athlon64 */ +-#define X86_FEATURE_K7 ( 3*32+ 5) /* "" Athlon */ +-#define X86_FEATURE_P3 ( 3*32+ 6) /* "" P3 */ +-#define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */ +-#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */ +-#define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */ +-#define X86_FEATURE_ART ( 3*32+10) /* Platform has always running timer (ART) */ +-#define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */ +-#define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */ +-#define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */ +-#define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in ia32 userspace */ +-#define 
X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in ia32 userspace */ +-#define X86_FEATURE_REP_GOOD ( 3*32+16) /* rep microcode works well */ +-#define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) /* "" Mfence synchronizes RDTSC */ +-#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" Lfence synchronizes RDTSC */ +-#define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */ +-#define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */ +-#define X86_FEATURE_ALWAYS ( 3*32+21) /* "" Always-present feature */ +-#define X86_FEATURE_XTOPOLOGY ( 3*32+22) /* cpu topology enum extensions */ +-#define X86_FEATURE_TSC_RELIABLE ( 3*32+23) /* TSC is known to be reliable */ +-#define X86_FEATURE_NONSTOP_TSC ( 3*32+24) /* TSC does not stop in C states */ +-#define X86_FEATURE_CPUID ( 3*32+25) /* CPU has CPUID instruction itself */ +-#define X86_FEATURE_EXTD_APICID ( 3*32+26) /* has extended APICID (8 bits) */ +-#define X86_FEATURE_AMD_DCM ( 3*32+27) /* multi-node processor */ +-#define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */ +-#define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */ +-#define X86_FEATURE_TSC_KNOWN_FREQ ( 3*32+31) /* TSC has known frequency */ ++#define X86_FEATURE_CXMMX ( 3*32+ 0) /* Cyrix MMX extensions */ ++#define X86_FEATURE_K6_MTRR ( 3*32+ 1) /* AMD K6 nonstandard MTRRs */ ++#define X86_FEATURE_CYRIX_ARR ( 3*32+ 2) /* Cyrix ARRs (= MTRRs) */ ++#define X86_FEATURE_CENTAUR_MCR ( 3*32+ 3) /* Centaur MCRs (= MTRRs) */ ++ ++/* CPU types for specific tunings: */ ++#define X86_FEATURE_K8 ( 3*32+ 4) /* "" Opteron, Athlon64 */ ++#define X86_FEATURE_K7 ( 3*32+ 5) /* "" Athlon */ ++#define X86_FEATURE_P3 ( 3*32+ 6) /* "" P3 */ ++#define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */ ++#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */ ++#define X86_FEATURE_UP ( 3*32+ 9) /* SMP kernel running on UP */ ++#define X86_FEATURE_ART ( 3*32+10) /* Always running timer (ART) */ ++#define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */ ++#define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */ ++#define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */ ++#define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in IA32 userspace */ ++#define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in IA32 userspace */ ++#define X86_FEATURE_REP_GOOD ( 3*32+16) /* REP microcode works well */ ++#define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) /* "" MFENCE synchronizes RDTSC */ ++#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" LFENCE synchronizes RDTSC */ ++#define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */ ++#define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */ ++#define X86_FEATURE_ALWAYS ( 3*32+21) /* "" Always-present feature */ ++#define X86_FEATURE_XTOPOLOGY ( 3*32+22) /* CPU topology enum extensions */ ++#define X86_FEATURE_TSC_RELIABLE ( 3*32+23) /* TSC is known to be reliable */ ++#define X86_FEATURE_NONSTOP_TSC ( 3*32+24) /* TSC does not stop in C states */ ++#define X86_FEATURE_CPUID ( 3*32+25) /* CPU has CPUID instruction itself */ ++#define X86_FEATURE_EXTD_APICID ( 3*32+26) /* Extended APICID (8 bits) */ ++#define X86_FEATURE_AMD_DCM ( 3*32+27) /* AMD multi-node processor */ ++#define X86_FEATURE_APERFMPERF ( 3*32+28) /* P-State hardware coordination feedback capability (APERF/MPERF MSRs) */ ++#define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */ ++#define X86_FEATURE_TSC_KNOWN_FREQ ( 3*32+31) /* TSC has known frequency */ + 
+-/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */ +-#define X86_FEATURE_XMM3 ( 4*32+ 0) /* "pni" SSE-3 */ +-#define X86_FEATURE_PCLMULQDQ ( 4*32+ 1) /* PCLMULQDQ instruction */ +-#define X86_FEATURE_DTES64 ( 4*32+ 2) /* 64-bit Debug Store */ +-#define X86_FEATURE_MWAIT ( 4*32+ 3) /* "monitor" Monitor/Mwait support */ +-#define X86_FEATURE_DSCPL ( 4*32+ 4) /* "ds_cpl" CPL Qual. Debug Store */ +-#define X86_FEATURE_VMX ( 4*32+ 5) /* Hardware virtualization */ +-#define X86_FEATURE_SMX ( 4*32+ 6) /* Safer mode */ +-#define X86_FEATURE_EST ( 4*32+ 7) /* Enhanced SpeedStep */ +-#define X86_FEATURE_TM2 ( 4*32+ 8) /* Thermal Monitor 2 */ +-#define X86_FEATURE_SSSE3 ( 4*32+ 9) /* Supplemental SSE-3 */ +-#define X86_FEATURE_CID ( 4*32+10) /* Context ID */ +-#define X86_FEATURE_SDBG ( 4*32+11) /* Silicon Debug */ +-#define X86_FEATURE_FMA ( 4*32+12) /* Fused multiply-add */ +-#define X86_FEATURE_CX16 ( 4*32+13) /* CMPXCHG16B */ +-#define X86_FEATURE_XTPR ( 4*32+14) /* Send Task Priority Messages */ +-#define X86_FEATURE_PDCM ( 4*32+15) /* Performance Capabilities */ +-#define X86_FEATURE_PCID ( 4*32+17) /* Process Context Identifiers */ +-#define X86_FEATURE_DCA ( 4*32+18) /* Direct Cache Access */ +-#define X86_FEATURE_XMM4_1 ( 4*32+19) /* "sse4_1" SSE-4.1 */ +-#define X86_FEATURE_XMM4_2 ( 4*32+20) /* "sse4_2" SSE-4.2 */ +-#define X86_FEATURE_X2APIC ( 4*32+21) /* x2APIC */ +-#define X86_FEATURE_MOVBE ( 4*32+22) /* MOVBE instruction */ +-#define X86_FEATURE_POPCNT ( 4*32+23) /* POPCNT instruction */ +-#define X86_FEATURE_TSC_DEADLINE_TIMER ( 4*32+24) /* Tsc deadline timer */ +-#define X86_FEATURE_AES ( 4*32+25) /* AES instructions */ +-#define X86_FEATURE_XSAVE ( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */ +-#define X86_FEATURE_OSXSAVE ( 4*32+27) /* "" XSAVE enabled in the OS */ +-#define X86_FEATURE_AVX ( 4*32+28) /* Advanced Vector Extensions */ +-#define X86_FEATURE_F16C ( 4*32+29) /* 16-bit fp conversions */ +-#define X86_FEATURE_RDRAND ( 4*32+30) /* The RDRAND instruction */ +-#define X86_FEATURE_HYPERVISOR ( 4*32+31) /* Running on a hypervisor */ ++/* Intel-defined CPU features, CPUID level 0x00000001 (ECX), word 4 */ ++#define X86_FEATURE_XMM3 ( 4*32+ 0) /* "pni" SSE-3 */ ++#define X86_FEATURE_PCLMULQDQ ( 4*32+ 1) /* PCLMULQDQ instruction */ ++#define X86_FEATURE_DTES64 ( 4*32+ 2) /* 64-bit Debug Store */ ++#define X86_FEATURE_MWAIT ( 4*32+ 3) /* "monitor" MONITOR/MWAIT support */ ++#define X86_FEATURE_DSCPL ( 4*32+ 4) /* "ds_cpl" CPL-qualified (filtered) Debug Store */ ++#define X86_FEATURE_VMX ( 4*32+ 5) /* Hardware virtualization */ ++#define X86_FEATURE_SMX ( 4*32+ 6) /* Safer Mode eXtensions */ ++#define X86_FEATURE_EST ( 4*32+ 7) /* Enhanced SpeedStep */ ++#define X86_FEATURE_TM2 ( 4*32+ 8) /* Thermal Monitor 2 */ ++#define X86_FEATURE_SSSE3 ( 4*32+ 9) /* Supplemental SSE-3 */ ++#define X86_FEATURE_CID ( 4*32+10) /* Context ID */ ++#define X86_FEATURE_SDBG ( 4*32+11) /* Silicon Debug */ ++#define X86_FEATURE_FMA ( 4*32+12) /* Fused multiply-add */ ++#define X86_FEATURE_CX16 ( 4*32+13) /* CMPXCHG16B instruction */ ++#define X86_FEATURE_XTPR ( 4*32+14) /* Send Task Priority Messages */ ++#define X86_FEATURE_PDCM ( 4*32+15) /* Perf/Debug Capabilities MSR */ ++#define X86_FEATURE_PCID ( 4*32+17) /* Process Context Identifiers */ ++#define X86_FEATURE_DCA ( 4*32+18) /* Direct Cache Access */ ++#define X86_FEATURE_XMM4_1 ( 4*32+19) /* "sse4_1" SSE-4.1 */ ++#define X86_FEATURE_XMM4_2 ( 4*32+20) /* "sse4_2" SSE-4.2 */ ++#define X86_FEATURE_X2APIC ( 4*32+21) /* X2APIC */ 
++#define X86_FEATURE_MOVBE ( 4*32+22) /* MOVBE instruction */ ++#define X86_FEATURE_POPCNT ( 4*32+23) /* POPCNT instruction */ ++#define X86_FEATURE_TSC_DEADLINE_TIMER ( 4*32+24) /* TSC deadline timer */ ++#define X86_FEATURE_AES ( 4*32+25) /* AES instructions */ ++#define X86_FEATURE_XSAVE ( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV instructions */ ++#define X86_FEATURE_OSXSAVE ( 4*32+27) /* "" XSAVE instruction enabled in the OS */ ++#define X86_FEATURE_AVX ( 4*32+28) /* Advanced Vector Extensions */ ++#define X86_FEATURE_F16C ( 4*32+29) /* 16-bit FP conversions */ ++#define X86_FEATURE_RDRAND ( 4*32+30) /* RDRAND instruction */ ++#define X86_FEATURE_HYPERVISOR ( 4*32+31) /* Running on a hypervisor */ + + /* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */ +-#define X86_FEATURE_XSTORE ( 5*32+ 2) /* "rng" RNG present (xstore) */ +-#define X86_FEATURE_XSTORE_EN ( 5*32+ 3) /* "rng_en" RNG enabled */ +-#define X86_FEATURE_XCRYPT ( 5*32+ 6) /* "ace" on-CPU crypto (xcrypt) */ +-#define X86_FEATURE_XCRYPT_EN ( 5*32+ 7) /* "ace_en" on-CPU crypto enabled */ +-#define X86_FEATURE_ACE2 ( 5*32+ 8) /* Advanced Cryptography Engine v2 */ +-#define X86_FEATURE_ACE2_EN ( 5*32+ 9) /* ACE v2 enabled */ +-#define X86_FEATURE_PHE ( 5*32+10) /* PadLock Hash Engine */ +-#define X86_FEATURE_PHE_EN ( 5*32+11) /* PHE enabled */ +-#define X86_FEATURE_PMM ( 5*32+12) /* PadLock Montgomery Multiplier */ +-#define X86_FEATURE_PMM_EN ( 5*32+13) /* PMM enabled */ ++#define X86_FEATURE_XSTORE ( 5*32+ 2) /* "rng" RNG present (xstore) */ ++#define X86_FEATURE_XSTORE_EN ( 5*32+ 3) /* "rng_en" RNG enabled */ ++#define X86_FEATURE_XCRYPT ( 5*32+ 6) /* "ace" on-CPU crypto (xcrypt) */ ++#define X86_FEATURE_XCRYPT_EN ( 5*32+ 7) /* "ace_en" on-CPU crypto enabled */ ++#define X86_FEATURE_ACE2 ( 5*32+ 8) /* Advanced Cryptography Engine v2 */ ++#define X86_FEATURE_ACE2_EN ( 5*32+ 9) /* ACE v2 enabled */ ++#define X86_FEATURE_PHE ( 5*32+10) /* PadLock Hash Engine */ ++#define X86_FEATURE_PHE_EN ( 5*32+11) /* PHE enabled */ ++#define X86_FEATURE_PMM ( 5*32+12) /* PadLock Montgomery Multiplier */ ++#define X86_FEATURE_PMM_EN ( 5*32+13) /* PMM enabled */ + +-/* More extended AMD flags: CPUID level 0x80000001, ecx, word 6 */ +-#define X86_FEATURE_LAHF_LM ( 6*32+ 0) /* LAHF/SAHF in long mode */ +-#define X86_FEATURE_CMP_LEGACY ( 6*32+ 1) /* If yes HyperThreading not valid */ +-#define X86_FEATURE_SVM ( 6*32+ 2) /* Secure virtual machine */ +-#define X86_FEATURE_EXTAPIC ( 6*32+ 3) /* Extended APIC space */ +-#define X86_FEATURE_CR8_LEGACY ( 6*32+ 4) /* CR8 in 32-bit mode */ +-#define X86_FEATURE_ABM ( 6*32+ 5) /* Advanced bit manipulation */ +-#define X86_FEATURE_SSE4A ( 6*32+ 6) /* SSE-4A */ +-#define X86_FEATURE_MISALIGNSSE ( 6*32+ 7) /* Misaligned SSE mode */ +-#define X86_FEATURE_3DNOWPREFETCH ( 6*32+ 8) /* 3DNow prefetch instructions */ +-#define X86_FEATURE_OSVW ( 6*32+ 9) /* OS Visible Workaround */ +-#define X86_FEATURE_IBS ( 6*32+10) /* Instruction Based Sampling */ +-#define X86_FEATURE_XOP ( 6*32+11) /* extended AVX instructions */ +-#define X86_FEATURE_SKINIT ( 6*32+12) /* SKINIT/STGI instructions */ +-#define X86_FEATURE_WDT ( 6*32+13) /* Watchdog timer */ +-#define X86_FEATURE_LWP ( 6*32+15) /* Light Weight Profiling */ +-#define X86_FEATURE_FMA4 ( 6*32+16) /* 4 operands MAC instructions */ +-#define X86_FEATURE_TCE ( 6*32+17) /* translation cache extension */ +-#define X86_FEATURE_NODEID_MSR ( 6*32+19) /* NodeId MSR */ +-#define X86_FEATURE_TBM ( 6*32+21) /* trailing bit manipulations */ +-#define 
X86_FEATURE_TOPOEXT ( 6*32+22) /* topology extensions CPUID leafs */ +-#define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter extensions */ +-#define X86_FEATURE_PERFCTR_NB ( 6*32+24) /* NB performance counter extensions */ +-#define X86_FEATURE_BPEXT (6*32+26) /* data breakpoint extension */ +-#define X86_FEATURE_PTSC ( 6*32+27) /* performance time-stamp counter */ +-#define X86_FEATURE_PERFCTR_LLC ( 6*32+28) /* Last Level Cache performance counter extensions */ +-#define X86_FEATURE_MWAITX ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */ ++/* More extended AMD flags: CPUID level 0x80000001, ECX, word 6 */ ++#define X86_FEATURE_LAHF_LM ( 6*32+ 0) /* LAHF/SAHF in long mode */ ++#define X86_FEATURE_CMP_LEGACY ( 6*32+ 1) /* If yes HyperThreading not valid */ ++#define X86_FEATURE_SVM ( 6*32+ 2) /* Secure Virtual Machine */ ++#define X86_FEATURE_EXTAPIC ( 6*32+ 3) /* Extended APIC space */ ++#define X86_FEATURE_CR8_LEGACY ( 6*32+ 4) /* CR8 in 32-bit mode */ ++#define X86_FEATURE_ABM ( 6*32+ 5) /* Advanced bit manipulation */ ++#define X86_FEATURE_SSE4A ( 6*32+ 6) /* SSE-4A */ ++#define X86_FEATURE_MISALIGNSSE ( 6*32+ 7) /* Misaligned SSE mode */ ++#define X86_FEATURE_3DNOWPREFETCH ( 6*32+ 8) /* 3DNow prefetch instructions */ ++#define X86_FEATURE_OSVW ( 6*32+ 9) /* OS Visible Workaround */ ++#define X86_FEATURE_IBS ( 6*32+10) /* Instruction Based Sampling */ ++#define X86_FEATURE_XOP ( 6*32+11) /* extended AVX instructions */ ++#define X86_FEATURE_SKINIT ( 6*32+12) /* SKINIT/STGI instructions */ ++#define X86_FEATURE_WDT ( 6*32+13) /* Watchdog timer */ ++#define X86_FEATURE_LWP ( 6*32+15) /* Light Weight Profiling */ ++#define X86_FEATURE_FMA4 ( 6*32+16) /* 4 operands MAC instructions */ ++#define X86_FEATURE_TCE ( 6*32+17) /* Translation Cache Extension */ ++#define X86_FEATURE_NODEID_MSR ( 6*32+19) /* NodeId MSR */ ++#define X86_FEATURE_TBM ( 6*32+21) /* Trailing Bit Manipulations */ ++#define X86_FEATURE_TOPOEXT ( 6*32+22) /* Topology extensions CPUID leafs */ ++#define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* Core performance counter extensions */ ++#define X86_FEATURE_PERFCTR_NB ( 6*32+24) /* NB performance counter extensions */ ++#define X86_FEATURE_BPEXT ( 6*32+26) /* Data breakpoint extension */ ++#define X86_FEATURE_PTSC ( 6*32+27) /* Performance time-stamp counter */ ++#define X86_FEATURE_PERFCTR_LLC ( 6*32+28) /* Last Level Cache performance counter extensions */ ++#define X86_FEATURE_MWAITX ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX instructions) */ + + /* + * Auxiliary flags: Linux defined - For features scattered in various +@@ -187,146 +190,155 @@ + * + * Reuse free bits when adding new feature flags! 
+ */ +-#define X86_FEATURE_RING3MWAIT ( 7*32+ 0) /* Ring 3 MONITOR/MWAIT */ +-#define X86_FEATURE_CPUID_FAULT ( 7*32+ 1) /* Intel CPUID faulting */ +-#define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */ +-#define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */ +-#define X86_FEATURE_CAT_L3 ( 7*32+ 4) /* Cache Allocation Technology L3 */ +-#define X86_FEATURE_CAT_L2 ( 7*32+ 5) /* Cache Allocation Technology L2 */ +-#define X86_FEATURE_CDP_L3 ( 7*32+ 6) /* Code and Data Prioritization L3 */ ++#define X86_FEATURE_RING3MWAIT ( 7*32+ 0) /* Ring 3 MONITOR/MWAIT instructions */ ++#define X86_FEATURE_CPUID_FAULT ( 7*32+ 1) /* Intel CPUID faulting */ ++#define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */ ++#define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */ ++#define X86_FEATURE_CAT_L3 ( 7*32+ 4) /* Cache Allocation Technology L3 */ ++#define X86_FEATURE_CAT_L2 ( 7*32+ 5) /* Cache Allocation Technology L2 */ ++#define X86_FEATURE_CDP_L3 ( 7*32+ 6) /* Code and Data Prioritization L3 */ + +-#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */ +-#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */ +-#define X86_FEATURE_SME ( 7*32+10) /* AMD Secure Memory Encryption */ ++#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */ ++#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */ ++#define X86_FEATURE_SME ( 7*32+10) /* AMD Secure Memory Encryption */ + +-#define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */ +-#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */ +-#define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */ +-#define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */ ++#define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */ ++#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */ ++#define X86_FEATURE_AVX512_4VNNIW ( 7*32+16) /* AVX-512 Neural Network Instructions */ ++#define X86_FEATURE_AVX512_4FMAPS ( 7*32+17) /* AVX-512 Multiply Accumulation Single precision */ + +-#define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */ ++#define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */ + + /* Virtualization flags: Linux defined, word 8 */ +-#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ +-#define X86_FEATURE_VNMI ( 8*32+ 1) /* Intel Virtual NMI */ +-#define X86_FEATURE_FLEXPRIORITY ( 8*32+ 2) /* Intel FlexPriority */ +-#define X86_FEATURE_EPT ( 8*32+ 3) /* Intel Extended Page Table */ +-#define X86_FEATURE_VPID ( 8*32+ 4) /* Intel Virtual Processor ID */ ++#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ ++#define X86_FEATURE_VNMI ( 8*32+ 1) /* Intel Virtual NMI */ ++#define X86_FEATURE_FLEXPRIORITY ( 8*32+ 2) /* Intel FlexPriority */ ++#define X86_FEATURE_EPT ( 8*32+ 3) /* Intel Extended Page Table */ ++#define X86_FEATURE_VPID ( 8*32+ 4) /* Intel Virtual Processor ID */ + +-#define X86_FEATURE_VMMCALL ( 8*32+15) /* Prefer vmmcall to vmcall */ +-#define X86_FEATURE_XENPV ( 8*32+16) /* "" Xen paravirtual guest */ ++#define X86_FEATURE_VMMCALL ( 8*32+15) /* Prefer VMMCALL to VMCALL */ ++#define X86_FEATURE_XENPV ( 8*32+16) /* "" Xen paravirtual guest */ + + +-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */ +-#define X86_FEATURE_FSGSBASE ( 9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/ +-#define X86_FEATURE_TSC_ADJUST ( 9*32+ 1) /* TSC adjustment 
MSR 0x3b */ +-#define X86_FEATURE_BMI1 ( 9*32+ 3) /* 1st group bit manipulation extensions */ +-#define X86_FEATURE_HLE ( 9*32+ 4) /* Hardware Lock Elision */ +-#define X86_FEATURE_AVX2 ( 9*32+ 5) /* AVX2 instructions */ +-#define X86_FEATURE_SMEP ( 9*32+ 7) /* Supervisor Mode Execution Protection */ +-#define X86_FEATURE_BMI2 ( 9*32+ 8) /* 2nd group bit manipulation extensions */ +-#define X86_FEATURE_ERMS ( 9*32+ 9) /* Enhanced REP MOVSB/STOSB */ +-#define X86_FEATURE_INVPCID ( 9*32+10) /* Invalidate Processor Context ID */ +-#define X86_FEATURE_RTM ( 9*32+11) /* Restricted Transactional Memory */ +-#define X86_FEATURE_CQM ( 9*32+12) /* Cache QoS Monitoring */ +-#define X86_FEATURE_MPX ( 9*32+14) /* Memory Protection Extension */ +-#define X86_FEATURE_RDT_A ( 9*32+15) /* Resource Director Technology Allocation */ +-#define X86_FEATURE_AVX512F ( 9*32+16) /* AVX-512 Foundation */ +-#define X86_FEATURE_AVX512DQ ( 9*32+17) /* AVX-512 DQ (Double/Quad granular) Instructions */ +-#define X86_FEATURE_RDSEED ( 9*32+18) /* The RDSEED instruction */ +-#define X86_FEATURE_ADX ( 9*32+19) /* The ADCX and ADOX instructions */ +-#define X86_FEATURE_SMAP ( 9*32+20) /* Supervisor Mode Access Prevention */ +-#define X86_FEATURE_AVX512IFMA ( 9*32+21) /* AVX-512 Integer Fused Multiply-Add instructions */ +-#define X86_FEATURE_CLFLUSHOPT ( 9*32+23) /* CLFLUSHOPT instruction */ +-#define X86_FEATURE_CLWB ( 9*32+24) /* CLWB instruction */ +-#define X86_FEATURE_AVX512PF ( 9*32+26) /* AVX-512 Prefetch */ +-#define X86_FEATURE_AVX512ER ( 9*32+27) /* AVX-512 Exponential and Reciprocal */ +-#define X86_FEATURE_AVX512CD ( 9*32+28) /* AVX-512 Conflict Detection */ +-#define X86_FEATURE_SHA_NI ( 9*32+29) /* SHA1/SHA256 Instruction Extensions */ +-#define X86_FEATURE_AVX512BW ( 9*32+30) /* AVX-512 BW (Byte/Word granular) Instructions */ +-#define X86_FEATURE_AVX512VL ( 9*32+31) /* AVX-512 VL (128/256 Vector Length) Extensions */ ++/* Intel-defined CPU features, CPUID level 0x00000007:0 (EBX), word 9 */ ++#define X86_FEATURE_FSGSBASE ( 9*32+ 0) /* RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/ ++#define X86_FEATURE_TSC_ADJUST ( 9*32+ 1) /* TSC adjustment MSR 0x3B */ ++#define X86_FEATURE_BMI1 ( 9*32+ 3) /* 1st group bit manipulation extensions */ ++#define X86_FEATURE_HLE ( 9*32+ 4) /* Hardware Lock Elision */ ++#define X86_FEATURE_AVX2 ( 9*32+ 5) /* AVX2 instructions */ ++#define X86_FEATURE_SMEP ( 9*32+ 7) /* Supervisor Mode Execution Protection */ ++#define X86_FEATURE_BMI2 ( 9*32+ 8) /* 2nd group bit manipulation extensions */ ++#define X86_FEATURE_ERMS ( 9*32+ 9) /* Enhanced REP MOVSB/STOSB instructions */ ++#define X86_FEATURE_INVPCID ( 9*32+10) /* Invalidate Processor Context ID */ ++#define X86_FEATURE_RTM ( 9*32+11) /* Restricted Transactional Memory */ ++#define X86_FEATURE_CQM ( 9*32+12) /* Cache QoS Monitoring */ ++#define X86_FEATURE_MPX ( 9*32+14) /* Memory Protection Extension */ ++#define X86_FEATURE_RDT_A ( 9*32+15) /* Resource Director Technology Allocation */ ++#define X86_FEATURE_AVX512F ( 9*32+16) /* AVX-512 Foundation */ ++#define X86_FEATURE_AVX512DQ ( 9*32+17) /* AVX-512 DQ (Double/Quad granular) Instructions */ ++#define X86_FEATURE_RDSEED ( 9*32+18) /* RDSEED instruction */ ++#define X86_FEATURE_ADX ( 9*32+19) /* ADCX and ADOX instructions */ ++#define X86_FEATURE_SMAP ( 9*32+20) /* Supervisor Mode Access Prevention */ ++#define X86_FEATURE_AVX512IFMA ( 9*32+21) /* AVX-512 Integer Fused Multiply-Add instructions */ ++#define X86_FEATURE_CLFLUSHOPT ( 9*32+23) /* CLFLUSHOPT instruction */ 
++#define X86_FEATURE_CLWB ( 9*32+24) /* CLWB instruction */ ++#define X86_FEATURE_AVX512PF ( 9*32+26) /* AVX-512 Prefetch */ ++#define X86_FEATURE_AVX512ER ( 9*32+27) /* AVX-512 Exponential and Reciprocal */ ++#define X86_FEATURE_AVX512CD ( 9*32+28) /* AVX-512 Conflict Detection */ ++#define X86_FEATURE_SHA_NI ( 9*32+29) /* SHA1/SHA256 Instruction Extensions */ ++#define X86_FEATURE_AVX512BW ( 9*32+30) /* AVX-512 BW (Byte/Word granular) Instructions */ ++#define X86_FEATURE_AVX512VL ( 9*32+31) /* AVX-512 VL (128/256 Vector Length) Extensions */ + +-/* Extended state features, CPUID level 0x0000000d:1 (eax), word 10 */ +-#define X86_FEATURE_XSAVEOPT (10*32+ 0) /* XSAVEOPT */ +-#define X86_FEATURE_XSAVEC (10*32+ 1) /* XSAVEC */ +-#define X86_FEATURE_XGETBV1 (10*32+ 2) /* XGETBV with ECX = 1 */ +-#define X86_FEATURE_XSAVES (10*32+ 3) /* XSAVES/XRSTORS */ ++/* Extended state features, CPUID level 0x0000000d:1 (EAX), word 10 */ ++#define X86_FEATURE_XSAVEOPT (10*32+ 0) /* XSAVEOPT instruction */ ++#define X86_FEATURE_XSAVEC (10*32+ 1) /* XSAVEC instruction */ ++#define X86_FEATURE_XGETBV1 (10*32+ 2) /* XGETBV with ECX = 1 instruction */ ++#define X86_FEATURE_XSAVES (10*32+ 3) /* XSAVES/XRSTORS instructions */ + +-/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (edx), word 11 */ +-#define X86_FEATURE_CQM_LLC (11*32+ 1) /* LLC QoS if 1 */ ++/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (EDX), word 11 */ ++#define X86_FEATURE_CQM_LLC (11*32+ 1) /* LLC QoS if 1 */ + +-/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (edx), word 12 */ +-#define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring if 1 */ +-#define X86_FEATURE_CQM_MBM_TOTAL (12*32+ 1) /* LLC Total MBM monitoring */ +-#define X86_FEATURE_CQM_MBM_LOCAL (12*32+ 2) /* LLC Local MBM monitoring */ ++/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (EDX), word 12 */ ++#define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring */ ++#define X86_FEATURE_CQM_MBM_TOTAL (12*32+ 1) /* LLC Total MBM monitoring */ ++#define X86_FEATURE_CQM_MBM_LOCAL (12*32+ 2) /* LLC Local MBM monitoring */ + +-/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */ +-#define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */ +-#define X86_FEATURE_IRPERF (13*32+1) /* Instructions Retired Count */ ++/* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */ ++#define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */ ++#define X86_FEATURE_IRPERF (13*32+ 1) /* Instructions Retired Count */ ++#define X86_FEATURE_XSAVEERPTR (13*32+ 2) /* Always save/restore FP error pointers */ + +-/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */ +-#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */ +-#define X86_FEATURE_IDA (14*32+ 1) /* Intel Dynamic Acceleration */ +-#define X86_FEATURE_ARAT (14*32+ 2) /* Always Running APIC Timer */ +-#define X86_FEATURE_PLN (14*32+ 4) /* Intel Power Limit Notification */ +-#define X86_FEATURE_PTS (14*32+ 6) /* Intel Package Thermal Status */ +-#define X86_FEATURE_HWP (14*32+ 7) /* Intel Hardware P-states */ +-#define X86_FEATURE_HWP_NOTIFY (14*32+ 8) /* HWP Notification */ +-#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */ +-#define X86_FEATURE_HWP_EPP (14*32+10) /* HWP Energy Perf. 
Preference */ +-#define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */ ++/* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */ ++#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */ ++#define X86_FEATURE_IDA (14*32+ 1) /* Intel Dynamic Acceleration */ ++#define X86_FEATURE_ARAT (14*32+ 2) /* Always Running APIC Timer */ ++#define X86_FEATURE_PLN (14*32+ 4) /* Intel Power Limit Notification */ ++#define X86_FEATURE_PTS (14*32+ 6) /* Intel Package Thermal Status */ ++#define X86_FEATURE_HWP (14*32+ 7) /* Intel Hardware P-states */ ++#define X86_FEATURE_HWP_NOTIFY (14*32+ 8) /* HWP Notification */ ++#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */ ++#define X86_FEATURE_HWP_EPP (14*32+10) /* HWP Energy Perf. Preference */ ++#define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */ + +-/* AMD SVM Feature Identification, CPUID level 0x8000000a (edx), word 15 */ +-#define X86_FEATURE_NPT (15*32+ 0) /* Nested Page Table support */ +-#define X86_FEATURE_LBRV (15*32+ 1) /* LBR Virtualization support */ +-#define X86_FEATURE_SVML (15*32+ 2) /* "svm_lock" SVM locking MSR */ +-#define X86_FEATURE_NRIPS (15*32+ 3) /* "nrip_save" SVM next_rip save */ +-#define X86_FEATURE_TSCRATEMSR (15*32+ 4) /* "tsc_scale" TSC scaling support */ +-#define X86_FEATURE_VMCBCLEAN (15*32+ 5) /* "vmcb_clean" VMCB clean bits support */ +-#define X86_FEATURE_FLUSHBYASID (15*32+ 6) /* flush-by-ASID support */ +-#define X86_FEATURE_DECODEASSISTS (15*32+ 7) /* Decode Assists support */ +-#define X86_FEATURE_PAUSEFILTER (15*32+10) /* filtered pause intercept */ +-#define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */ +-#define X86_FEATURE_AVIC (15*32+13) /* Virtual Interrupt Controller */ +-#define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */ +-#define X86_FEATURE_VGIF (15*32+16) /* Virtual GIF */ ++/* AMD SVM Feature Identification, CPUID level 0x8000000a (EDX), word 15 */ ++#define X86_FEATURE_NPT (15*32+ 0) /* Nested Page Table support */ ++#define X86_FEATURE_LBRV (15*32+ 1) /* LBR Virtualization support */ ++#define X86_FEATURE_SVML (15*32+ 2) /* "svm_lock" SVM locking MSR */ ++#define X86_FEATURE_NRIPS (15*32+ 3) /* "nrip_save" SVM next_rip save */ ++#define X86_FEATURE_TSCRATEMSR (15*32+ 4) /* "tsc_scale" TSC scaling support */ ++#define X86_FEATURE_VMCBCLEAN (15*32+ 5) /* "vmcb_clean" VMCB clean bits support */ ++#define X86_FEATURE_FLUSHBYASID (15*32+ 6) /* flush-by-ASID support */ ++#define X86_FEATURE_DECODEASSISTS (15*32+ 7) /* Decode Assists support */ ++#define X86_FEATURE_PAUSEFILTER (15*32+10) /* filtered pause intercept */ ++#define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */ ++#define X86_FEATURE_AVIC (15*32+13) /* Virtual Interrupt Controller */ ++#define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */ ++#define X86_FEATURE_VGIF (15*32+16) /* Virtual GIF */ + +-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */ +-#define X86_FEATURE_AVX512VBMI (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/ +-#define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */ +-#define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ +-#define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */ +-#define X86_FEATURE_LA57 (16*32+16) /* 5-level page tables */ +-#define X86_FEATURE_RDPID (16*32+22) /* RDPID instruction */ ++/* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), 
word 16 */ ++#define X86_FEATURE_AVX512VBMI (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/ ++#define X86_FEATURE_UMIP (16*32+ 2) /* User Mode Instruction Protection */ ++#define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */ ++#define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ ++#define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */ ++#define X86_FEATURE_GFNI (16*32+ 8) /* Galois Field New Instructions */ ++#define X86_FEATURE_VAES (16*32+ 9) /* Vector AES */ ++#define X86_FEATURE_VPCLMULQDQ (16*32+10) /* Carry-Less Multiplication Double Quadword */ ++#define X86_FEATURE_AVX512_VNNI (16*32+11) /* Vector Neural Network Instructions */ ++#define X86_FEATURE_AVX512_BITALG (16*32+12) /* Support for VPOPCNT[B,W] and VPSHUF-BITQMB instructions */ ++#define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */ ++#define X86_FEATURE_LA57 (16*32+16) /* 5-level page tables */ ++#define X86_FEATURE_RDPID (16*32+22) /* RDPID instruction */ + +-/* AMD-defined CPU features, CPUID level 0x80000007 (ebx), word 17 */ +-#define X86_FEATURE_OVERFLOW_RECOV (17*32+0) /* MCA overflow recovery support */ +-#define X86_FEATURE_SUCCOR (17*32+1) /* Uncorrectable error containment and recovery */ +-#define X86_FEATURE_SMCA (17*32+3) /* Scalable MCA */ ++/* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */ ++#define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* MCA overflow recovery support */ ++#define X86_FEATURE_SUCCOR (17*32+ 1) /* Uncorrectable error containment and recovery */ ++#define X86_FEATURE_SMCA (17*32+ 3) /* Scalable MCA */ + + /* + * BUG word(s) + */ +-#define X86_BUG(x) (NCAPINTS*32 + (x)) ++#define X86_BUG(x) (NCAPINTS*32 + (x)) + +-#define X86_BUG_F00F X86_BUG(0) /* Intel F00F */ +-#define X86_BUG_FDIV X86_BUG(1) /* FPU FDIV */ +-#define X86_BUG_COMA X86_BUG(2) /* Cyrix 6x86 coma */ +-#define X86_BUG_AMD_TLB_MMATCH X86_BUG(3) /* "tlb_mmatch" AMD Erratum 383 */ +-#define X86_BUG_AMD_APIC_C1E X86_BUG(4) /* "apic_c1e" AMD Erratum 400 */ +-#define X86_BUG_11AP X86_BUG(5) /* Bad local APIC aka 11AP */ +-#define X86_BUG_FXSAVE_LEAK X86_BUG(6) /* FXSAVE leaks FOP/FIP/FOP */ +-#define X86_BUG_CLFLUSH_MONITOR X86_BUG(7) /* AAI65, CLFLUSH required before MONITOR */ +-#define X86_BUG_SYSRET_SS_ATTRS X86_BUG(8) /* SYSRET doesn't fix up SS attrs */ ++#define X86_BUG_F00F X86_BUG(0) /* Intel F00F */ ++#define X86_BUG_FDIV X86_BUG(1) /* FPU FDIV */ ++#define X86_BUG_COMA X86_BUG(2) /* Cyrix 6x86 coma */ ++#define X86_BUG_AMD_TLB_MMATCH X86_BUG(3) /* "tlb_mmatch" AMD Erratum 383 */ ++#define X86_BUG_AMD_APIC_C1E X86_BUG(4) /* "apic_c1e" AMD Erratum 400 */ ++#define X86_BUG_11AP X86_BUG(5) /* Bad local APIC aka 11AP */ ++#define X86_BUG_FXSAVE_LEAK X86_BUG(6) /* FXSAVE leaks FOP/FIP/FOP */ ++#define X86_BUG_CLFLUSH_MONITOR X86_BUG(7) /* AAI65, CLFLUSH required before MONITOR */ ++#define X86_BUG_SYSRET_SS_ATTRS X86_BUG(8) /* SYSRET doesn't fix up SS attrs */ + #ifdef CONFIG_X86_32 + /* + * 64-bit kernels don't use X86_BUG_ESPFIX. Make the define conditional + * to avoid confusion. 
+ */ +-#define X86_BUG_ESPFIX X86_BUG(9) /* "" IRET to 16-bit SS corrupts ESP/RSP high bits */ ++#define X86_BUG_ESPFIX X86_BUG(9) /* "" IRET to 16-bit SS corrupts ESP/RSP high bits */ + #endif +-#define X86_BUG_NULL_SEG X86_BUG(10) /* Nulling a selector preserves the base */ +-#define X86_BUG_SWAPGS_FENCE X86_BUG(11) /* SWAPGS without input dep on GS */ +-#define X86_BUG_MONITOR X86_BUG(12) /* IPI required to wake up remote CPU */ +-#define X86_BUG_AMD_E400 X86_BUG(13) /* CPU is among the affected by Erratum 400 */ ++#define X86_BUG_NULL_SEG X86_BUG(10) /* Nulling a selector preserves the base */ ++#define X86_BUG_SWAPGS_FENCE X86_BUG(11) /* SWAPGS without input dep on GS */ ++#define X86_BUG_MONITOR X86_BUG(12) /* IPI required to wake up remote CPU */ ++#define X86_BUG_AMD_E400 X86_BUG(13) /* CPU is among the affected by Erratum 400 */ ++ + #endif /* _ASM_X86_CPUFEATURES_H */ +diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h +index 0a3e808b9123..2ace1f90d138 100644 +--- a/arch/x86/include/asm/desc.h ++++ b/arch/x86/include/asm/desc.h +@@ -60,17 +60,10 @@ static inline struct desc_struct *get_current_gdt_rw(void) + return this_cpu_ptr(&gdt_page)->gdt; + } + +-/* Get the fixmap index for a specific processor */ +-static inline unsigned int get_cpu_gdt_ro_index(int cpu) +-{ +- return FIX_GDT_REMAP_BEGIN + cpu; +-} +- + /* Provide the fixmap address of the remapped GDT */ + static inline struct desc_struct *get_cpu_gdt_ro(int cpu) + { +- unsigned int idx = get_cpu_gdt_ro_index(cpu); +- return (struct desc_struct *)__fix_to_virt(idx); ++ return (struct desc_struct *)&get_cpu_entry_area(cpu)->gdt; + } + + /* Provide the current read-only GDT */ +@@ -185,7 +178,7 @@ static inline void set_tssldt_descriptor(void *d, unsigned long addr, + #endif + } + +-static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr) ++static inline void __set_tss_desc(unsigned cpu, unsigned int entry, struct x86_hw_tss *addr) + { + struct desc_struct *d = get_cpu_gdt_rw(cpu); + tss_desc tss; +diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h +index dcd9fb55e679..94fc4fa14127 100644 +--- a/arch/x86/include/asm/fixmap.h ++++ b/arch/x86/include/asm/fixmap.h +@@ -44,6 +44,45 @@ extern unsigned long __FIXADDR_TOP; + PAGE_SIZE) + #endif + ++/* ++ * cpu_entry_area is a percpu region in the fixmap that contains things ++ * needed by the CPU and early entry/exit code. Real types aren't used ++ * for all fields here to avoid circular header dependencies. ++ * ++ * Every field is a virtual alias of some other allocated backing store. ++ * There is no direct allocation of a struct cpu_entry_area. ++ */ ++struct cpu_entry_area { ++ char gdt[PAGE_SIZE]; ++ ++ /* ++ * The GDT is just below SYSENTER_stack and thus serves (on x86_64) as ++ * a a read-only guard page. ++ */ ++ struct SYSENTER_stack_page SYSENTER_stack_page; ++ ++ /* ++ * On x86_64, the TSS is mapped RO. On x86_32, it's mapped RW because ++ * we need task switches to work, and task switches write to the TSS. ++ */ ++ struct tss_struct tss; ++ ++ char entry_trampoline[PAGE_SIZE]; ++ ++#ifdef CONFIG_X86_64 ++ /* ++ * Exception stacks used for IST entries. ++ * ++ * In the future, this should have a separate slot for each stack ++ * with guard pages between them. 
++ */ ++ char exception_stacks[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]; ++#endif ++}; ++ ++#define CPU_ENTRY_AREA_PAGES (sizeof(struct cpu_entry_area) / PAGE_SIZE) ++ ++extern void setup_cpu_entry_areas(void); + + /* + * Here we define all the compile-time 'special' virtual +@@ -101,8 +140,14 @@ enum fixed_addresses { + FIX_LNW_VRTC, + #endif + /* Fixmap entries to remap the GDTs, one per processor. */ +- FIX_GDT_REMAP_BEGIN, +- FIX_GDT_REMAP_END = FIX_GDT_REMAP_BEGIN + NR_CPUS - 1, ++ FIX_CPU_ENTRY_AREA_TOP, ++ FIX_CPU_ENTRY_AREA_BOTTOM = FIX_CPU_ENTRY_AREA_TOP + (CPU_ENTRY_AREA_PAGES * NR_CPUS) - 1, ++ ++#ifdef CONFIG_ACPI_APEI_GHES ++ /* Used for GHES mapping from assorted contexts */ ++ FIX_APEI_GHES_IRQ, ++ FIX_APEI_GHES_NMI, ++#endif + + __end_of_permanent_fixed_addresses, + +@@ -185,5 +230,30 @@ void __init *early_memremap_decrypted_wp(resource_size_t phys_addr, + void __early_set_fixmap(enum fixed_addresses idx, + phys_addr_t phys, pgprot_t flags); + ++static inline unsigned int __get_cpu_entry_area_page_index(int cpu, int page) ++{ ++ BUILD_BUG_ON(sizeof(struct cpu_entry_area) % PAGE_SIZE != 0); ++ ++ return FIX_CPU_ENTRY_AREA_BOTTOM - cpu*CPU_ENTRY_AREA_PAGES - page; ++} ++ ++#define __get_cpu_entry_area_offset_index(cpu, offset) ({ \ ++ BUILD_BUG_ON(offset % PAGE_SIZE != 0); \ ++ __get_cpu_entry_area_page_index(cpu, offset / PAGE_SIZE); \ ++ }) ++ ++#define get_cpu_entry_area_index(cpu, field) \ ++ __get_cpu_entry_area_offset_index((cpu), offsetof(struct cpu_entry_area, field)) ++ ++static inline struct cpu_entry_area *get_cpu_entry_area(int cpu) ++{ ++ return (struct cpu_entry_area *)__fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0)); ++} ++ ++static inline struct SYSENTER_stack *cpu_SYSENTER_stack(int cpu) ++{ ++ return &get_cpu_entry_area(cpu)->SYSENTER_stack_page.stack; ++} ++ + #endif /* !__ASSEMBLY__ */ + #endif /* _ASM_X86_FIXMAP_H */ +diff --git a/arch/x86/include/asm/hypervisor.h b/arch/x86/include/asm/hypervisor.h +index 0ead9dbb9130..96aa6b9884dc 100644 +--- a/arch/x86/include/asm/hypervisor.h ++++ b/arch/x86/include/asm/hypervisor.h +@@ -20,14 +20,22 @@ + #ifndef _ASM_X86_HYPERVISOR_H + #define _ASM_X86_HYPERVISOR_H + ++/* x86 hypervisor types */ ++enum x86_hypervisor_type { ++ X86_HYPER_NATIVE = 0, ++ X86_HYPER_VMWARE, ++ X86_HYPER_MS_HYPERV, ++ X86_HYPER_XEN_PV, ++ X86_HYPER_XEN_HVM, ++ X86_HYPER_KVM, ++}; ++ + #ifdef CONFIG_HYPERVISOR_GUEST + + #include ++#include + #include + +-/* +- * x86 hypervisor information +- */ + struct hypervisor_x86 { + /* Hypervisor name */ + const char *name; +@@ -35,40 +43,27 @@ struct hypervisor_x86 { + /* Detection routine */ + uint32_t (*detect)(void); + +- /* Platform setup (run once per boot) */ +- void (*init_platform)(void); +- +- /* X2APIC detection (run once per boot) */ +- bool (*x2apic_available)(void); ++ /* Hypervisor type */ ++ enum x86_hypervisor_type type; + +- /* pin current vcpu to specified physical cpu (run rarely) */ +- void (*pin_vcpu)(int); ++ /* init time callbacks */ ++ struct x86_hyper_init init; + +- /* called during init_mem_mapping() to setup early mappings. 
*/ +- void (*init_mem_mapping)(void); ++ /* runtime callbacks */ ++ struct x86_hyper_runtime runtime; + }; + +-extern const struct hypervisor_x86 *x86_hyper; +- +-/* Recognized hypervisors */ +-extern const struct hypervisor_x86 x86_hyper_vmware; +-extern const struct hypervisor_x86 x86_hyper_ms_hyperv; +-extern const struct hypervisor_x86 x86_hyper_xen_pv; +-extern const struct hypervisor_x86 x86_hyper_xen_hvm; +-extern const struct hypervisor_x86 x86_hyper_kvm; +- ++extern enum x86_hypervisor_type x86_hyper_type; + extern void init_hypervisor_platform(void); +-extern bool hypervisor_x2apic_available(void); +-extern void hypervisor_pin_vcpu(int cpu); +- +-static inline void hypervisor_init_mem_mapping(void) ++static inline bool hypervisor_is_type(enum x86_hypervisor_type type) + { +- if (x86_hyper && x86_hyper->init_mem_mapping) +- x86_hyper->init_mem_mapping(); ++ return x86_hyper_type == type; + } + #else + static inline void init_hypervisor_platform(void) { } +-static inline bool hypervisor_x2apic_available(void) { return false; } +-static inline void hypervisor_init_mem_mapping(void) { } ++static inline bool hypervisor_is_type(enum x86_hypervisor_type type) ++{ ++ return type == X86_HYPER_NATIVE; ++} + #endif /* CONFIG_HYPERVISOR_GUEST */ + #endif /* _ASM_X86_HYPERVISOR_H */ +diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h +index c8ef23f2c28f..89f08955fff7 100644 +--- a/arch/x86/include/asm/irqflags.h ++++ b/arch/x86/include/asm/irqflags.h +@@ -142,6 +142,9 @@ static inline notrace unsigned long arch_local_irq_save(void) + swapgs; \ + sysretl + ++#ifdef CONFIG_DEBUG_ENTRY ++#define SAVE_FLAGS(x) pushfq; popq %rax ++#endif + #else + #define INTERRUPT_RETURN iret + #define ENABLE_INTERRUPTS_SYSEXIT sti; sysexit +diff --git a/arch/x86/include/asm/kdebug.h b/arch/x86/include/asm/kdebug.h +index f86a8caa561e..395c9631e000 100644 +--- a/arch/x86/include/asm/kdebug.h ++++ b/arch/x86/include/asm/kdebug.h +@@ -26,6 +26,7 @@ extern void die(const char *, struct pt_regs *,long); + extern int __must_check __die(const char *, struct pt_regs *, long); + extern void show_stack_regs(struct pt_regs *regs); + extern void __show_regs(struct pt_regs *regs, int all); ++extern void show_iret_regs(struct pt_regs *regs); + extern unsigned long oops_begin(void); + extern void oops_end(unsigned long, struct pt_regs *, int signr); + +diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h +index 6699fc441644..6d16d15d09a0 100644 +--- a/arch/x86/include/asm/mmu_context.h ++++ b/arch/x86/include/asm/mmu_context.h +@@ -73,8 +73,8 @@ static inline void load_mm_ldt(struct mm_struct *mm) + #ifdef CONFIG_MODIFY_LDT_SYSCALL + struct ldt_struct *ldt; + +- /* lockless_dereference synchronizes with smp_store_release */ +- ldt = lockless_dereference(mm->context.ldt); ++ /* READ_ONCE synchronizes with smp_store_release */ ++ ldt = READ_ONCE(mm->context.ldt); + + /* + * Any change to mm->context.ldt is followed by an IPI to all +diff --git a/arch/x86/include/asm/module.h b/arch/x86/include/asm/module.h +index 8546fafa21a9..7948a17febb4 100644 +--- a/arch/x86/include/asm/module.h ++++ b/arch/x86/include/asm/module.h +@@ -6,7 +6,7 @@ + #include + + struct mod_arch_specific { +-#ifdef CONFIG_ORC_UNWINDER ++#ifdef CONFIG_UNWINDER_ORC + unsigned int num_orcs; + int *orc_unwind_ip; + struct orc_entry *orc_unwind; +diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h +index fd81228e8037..892df375b615 100644 +--- a/arch/x86/include/asm/paravirt.h 
++++ b/arch/x86/include/asm/paravirt.h +@@ -16,10 +16,9 @@ + #include + #include + +-static inline void load_sp0(struct tss_struct *tss, +- struct thread_struct *thread) ++static inline void load_sp0(unsigned long sp0) + { +- PVOP_VCALL2(pv_cpu_ops.load_sp0, tss, thread); ++ PVOP_VCALL1(pv_cpu_ops.load_sp0, sp0); + } + + /* The paravirtualized CPUID instruction. */ +@@ -928,6 +927,15 @@ extern void default_banner(void); + PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret64), \ + CLBR_NONE, \ + jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_usergs_sysret64)) ++ ++#ifdef CONFIG_DEBUG_ENTRY ++#define SAVE_FLAGS(clobbers) \ ++ PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_save_fl), clobbers, \ ++ PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE); \ ++ call PARA_INDIRECT(pv_irq_ops+PV_IRQ_save_fl); \ ++ PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);) ++#endif ++ + #endif /* CONFIG_X86_32 */ + + #endif /* __ASSEMBLY__ */ +diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h +index 10cc3b9709fe..6ec54d01972d 100644 +--- a/arch/x86/include/asm/paravirt_types.h ++++ b/arch/x86/include/asm/paravirt_types.h +@@ -134,7 +134,7 @@ struct pv_cpu_ops { + void (*alloc_ldt)(struct desc_struct *ldt, unsigned entries); + void (*free_ldt)(struct desc_struct *ldt, unsigned entries); + +- void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t); ++ void (*load_sp0)(unsigned long sp0); + + void (*set_iopl_mask)(unsigned mask); + +diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h +index 377f1ffd18be..ba3c523aaf16 100644 +--- a/arch/x86/include/asm/percpu.h ++++ b/arch/x86/include/asm/percpu.h +@@ -526,7 +526,7 @@ static inline bool x86_this_cpu_variable_test_bit(int nr, + { + bool oldbit; + +- asm volatile("bt "__percpu_arg(2)",%1\n\t" ++ asm volatile("bt "__percpu_arg(2)",%1" + CC_SET(c) + : CC_OUT(c) (oldbit) + : "m" (*(unsigned long __percpu *)addr), "Ir" (nr)); +diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h +index 59df7b47a434..9e9b05fc4860 100644 +--- a/arch/x86/include/asm/pgtable_types.h ++++ b/arch/x86/include/asm/pgtable_types.h +@@ -200,10 +200,9 @@ enum page_cache_mode { + + #define _PAGE_ENC (_AT(pteval_t, sme_me_mask)) + +-#define _PAGE_TABLE (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | \ +- _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_ENC) + #define _KERNPG_TABLE (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | \ + _PAGE_DIRTY | _PAGE_ENC) ++#define _PAGE_TABLE (_KERNPG_TABLE | _PAGE_USER) + + #define __PAGE_KERNEL_ENC (__PAGE_KERNEL | _PAGE_ENC) + #define __PAGE_KERNEL_ENC_WP (__PAGE_KERNEL_WP | _PAGE_ENC) +diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h +index bdac19ab2488..da943411d3d8 100644 +--- a/arch/x86/include/asm/processor.h ++++ b/arch/x86/include/asm/processor.h +@@ -162,9 +162,9 @@ enum cpuid_regs_idx { + extern struct cpuinfo_x86 boot_cpu_data; + extern struct cpuinfo_x86 new_cpu_data; + +-extern struct tss_struct doublefault_tss; +-extern __u32 cpu_caps_cleared[NCAPINTS]; +-extern __u32 cpu_caps_set[NCAPINTS]; ++extern struct x86_hw_tss doublefault_tss; ++extern __u32 cpu_caps_cleared[NCAPINTS + NBUGINTS]; ++extern __u32 cpu_caps_set[NCAPINTS + NBUGINTS]; + + #ifdef CONFIG_SMP + DECLARE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info); +@@ -252,6 +252,11 @@ static inline void load_cr3(pgd_t *pgdir) + write_cr3(__sme_pa(pgdir)); + } + ++/* ++ * Note that while the legacy 'TSS' name comes from 'Task State Segment', ++ * on modern x86 CPUs the TSS also holds information important 
to 64-bit mode, ++ * unrelated to the task-switch mechanism: ++ */ + #ifdef CONFIG_X86_32 + /* This is the TSS defined by the hardware. */ + struct x86_hw_tss { +@@ -304,7 +309,13 @@ struct x86_hw_tss { + struct x86_hw_tss { + u32 reserved1; + u64 sp0; ++ ++ /* ++ * We store cpu_current_top_of_stack in sp1 so it's always accessible. ++ * Linux does not use ring 1, so sp1 is not otherwise needed. ++ */ + u64 sp1; ++ + u64 sp2; + u64 reserved2; + u64 ist[7]; +@@ -322,12 +333,22 @@ struct x86_hw_tss { + #define IO_BITMAP_BITS 65536 + #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8) + #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long)) +-#define IO_BITMAP_OFFSET offsetof(struct tss_struct, io_bitmap) ++#define IO_BITMAP_OFFSET (offsetof(struct tss_struct, io_bitmap) - offsetof(struct tss_struct, x86_tss)) + #define INVALID_IO_BITMAP_OFFSET 0x8000 + ++struct SYSENTER_stack { ++ unsigned long words[64]; ++}; ++ ++struct SYSENTER_stack_page { ++ struct SYSENTER_stack stack; ++} __aligned(PAGE_SIZE); ++ + struct tss_struct { + /* +- * The hardware state: ++ * The fixed hardware portion. This must not cross a page boundary ++ * at risk of violating the SDM's advice and potentially triggering ++ * errata. + */ + struct x86_hw_tss x86_tss; + +@@ -338,18 +359,9 @@ struct tss_struct { + * be within the limit. + */ + unsigned long io_bitmap[IO_BITMAP_LONGS + 1]; ++} __aligned(PAGE_SIZE); + +-#ifdef CONFIG_X86_32 +- /* +- * Space for the temporary SYSENTER stack. +- */ +- unsigned long SYSENTER_stack_canary; +- unsigned long SYSENTER_stack[64]; +-#endif +- +-} ____cacheline_aligned; +- +-DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss); ++DECLARE_PER_CPU_PAGE_ALIGNED(struct tss_struct, cpu_tss_rw); + + /* + * sizeof(unsigned long) coming from an extra "long" at the end +@@ -363,6 +375,9 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss); + + #ifdef CONFIG_X86_32 + DECLARE_PER_CPU(unsigned long, cpu_current_top_of_stack); ++#else ++/* The RO copy can't be accessed with this_cpu_xyz(), so use the RW copy. */ ++#define cpu_current_top_of_stack cpu_tss_rw.x86_tss.sp1 + #endif + + /* +@@ -431,7 +446,9 @@ typedef struct { + struct thread_struct { + /* Cached TLS descriptors: */ + struct desc_struct tls_array[GDT_ENTRY_TLS_ENTRIES]; ++#ifdef CONFIG_X86_32 + unsigned long sp0; ++#endif + unsigned long sp; + #ifdef CONFIG_X86_32 + unsigned long sysenter_cs; +@@ -518,16 +535,9 @@ static inline void native_set_iopl_mask(unsigned mask) + } + + static inline void +-native_load_sp0(struct tss_struct *tss, struct thread_struct *thread) ++native_load_sp0(unsigned long sp0) + { +- tss->x86_tss.sp0 = thread->sp0; +-#ifdef CONFIG_X86_32 +- /* Only happens when SEP is enabled, no need to test "SEP"arately: */ +- if (unlikely(tss->x86_tss.ss1 != thread->sysenter_cs)) { +- tss->x86_tss.ss1 = thread->sysenter_cs; +- wrmsr(MSR_IA32_SYSENTER_CS, thread->sysenter_cs, 0); +- } +-#endif ++ this_cpu_write(cpu_tss_rw.x86_tss.sp0, sp0); + } + + static inline void native_swapgs(void) +@@ -539,12 +549,18 @@ static inline void native_swapgs(void) + + static inline unsigned long current_top_of_stack(void) + { +-#ifdef CONFIG_X86_64 +- return this_cpu_read_stable(cpu_tss.x86_tss.sp0); +-#else +- /* sp0 on x86_32 is special in and around vm86 mode. */ ++ /* ++ * We can't read directly from tss.sp0: sp0 on x86_32 is special in ++ * and around vm86 mode and sp0 on x86_64 is special because of the ++ * entry trampoline. 
++ */ + return this_cpu_read_stable(cpu_current_top_of_stack); +-#endif ++} ++ ++static inline bool on_thread_stack(void) ++{ ++ return (unsigned long)(current_top_of_stack() - ++ current_stack_pointer) < THREAD_SIZE; + } + + #ifdef CONFIG_PARAVIRT +@@ -552,10 +568,9 @@ static inline unsigned long current_top_of_stack(void) + #else + #define __cpuid native_cpuid + +-static inline void load_sp0(struct tss_struct *tss, +- struct thread_struct *thread) ++static inline void load_sp0(unsigned long sp0) + { +- native_load_sp0(tss, thread); ++ native_load_sp0(sp0); + } + + #define set_iopl_mask native_set_iopl_mask +@@ -804,6 +819,15 @@ static inline void spin_lock_prefetch(const void *x) + #define TOP_OF_INIT_STACK ((unsigned long)&init_stack + sizeof(init_stack) - \ + TOP_OF_KERNEL_STACK_PADDING) + ++#define task_top_of_stack(task) ((unsigned long)(task_pt_regs(task) + 1)) ++ ++#define task_pt_regs(task) \ ++({ \ ++ unsigned long __ptr = (unsigned long)task_stack_page(task); \ ++ __ptr += THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING; \ ++ ((struct pt_regs *)__ptr) - 1; \ ++}) ++ + #ifdef CONFIG_X86_32 + /* + * User space process size: 3GB (default). +@@ -823,23 +847,6 @@ static inline void spin_lock_prefetch(const void *x) + .addr_limit = KERNEL_DS, \ + } + +-/* +- * TOP_OF_KERNEL_STACK_PADDING reserves 8 bytes on top of the ring0 stack. +- * This is necessary to guarantee that the entire "struct pt_regs" +- * is accessible even if the CPU haven't stored the SS/ESP registers +- * on the stack (interrupt gate does not save these registers +- * when switching to the same priv ring). +- * Therefore beware: accessing the ss/esp fields of the +- * "struct pt_regs" is possible, but they may contain the +- * completely wrong values. +- */ +-#define task_pt_regs(task) \ +-({ \ +- unsigned long __ptr = (unsigned long)task_stack_page(task); \ +- __ptr += THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING; \ +- ((struct pt_regs *)__ptr) - 1; \ +-}) +- + #define KSTK_ESP(task) (task_pt_regs(task)->sp) + + #else +@@ -873,11 +880,9 @@ static inline void spin_lock_prefetch(const void *x) + #define STACK_TOP_MAX TASK_SIZE_MAX + + #define INIT_THREAD { \ +- .sp0 = TOP_OF_INIT_STACK, \ + .addr_limit = KERNEL_DS, \ + } + +-#define task_pt_regs(tsk) ((struct pt_regs *)(tsk)->thread.sp0 - 1) + extern unsigned long KSTK_ESP(struct task_struct *task); + + #endif /* CONFIG_X86_64 */ +diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h +index c0e3c45cf6ab..14131dd06b29 100644 +--- a/arch/x86/include/asm/ptrace.h ++++ b/arch/x86/include/asm/ptrace.h +@@ -136,9 +136,9 @@ static inline int v8086_mode(struct pt_regs *regs) + #endif + } + +-#ifdef CONFIG_X86_64 + static inline bool user_64bit_mode(struct pt_regs *regs) + { ++#ifdef CONFIG_X86_64 + #ifndef CONFIG_PARAVIRT + /* + * On non-paravirt systems, this is the only long mode CPL 3 +@@ -149,8 +149,12 @@ static inline bool user_64bit_mode(struct pt_regs *regs) + /* Headers are too twisted for this to go in paravirt.h. 
*/ + return regs->cs == __USER_CS || regs->cs == pv_info.extra_user_64bit_cs; + #endif ++#else /* !CONFIG_X86_64 */ ++ return false; ++#endif + } + ++#ifdef CONFIG_X86_64 + #define current_user_stack_pointer() current_pt_regs()->sp + #define compat_user_stack_pointer() current_pt_regs()->sp + #endif +diff --git a/arch/x86/include/asm/rmwcc.h b/arch/x86/include/asm/rmwcc.h +index d8f3a6ae9f6c..f91c365e57c3 100644 +--- a/arch/x86/include/asm/rmwcc.h ++++ b/arch/x86/include/asm/rmwcc.h +@@ -29,7 +29,7 @@ cc_label: \ + #define __GEN_RMWcc(fullop, var, cc, clobbers, ...) \ + do { \ + bool c; \ +- asm volatile (fullop ";" CC_SET(cc) \ ++ asm volatile (fullop CC_SET(cc) \ + : [counter] "+m" (var), CC_OUT(cc) (c) \ + : __VA_ARGS__ : clobbers); \ + return c; \ +diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h +index 8da111b3c342..f8062bfd43a0 100644 +--- a/arch/x86/include/asm/stacktrace.h ++++ b/arch/x86/include/asm/stacktrace.h +@@ -16,6 +16,7 @@ enum stack_type { + STACK_TYPE_TASK, + STACK_TYPE_IRQ, + STACK_TYPE_SOFTIRQ, ++ STACK_TYPE_SYSENTER, + STACK_TYPE_EXCEPTION, + STACK_TYPE_EXCEPTION_LAST = STACK_TYPE_EXCEPTION + N_EXCEPTION_STACKS-1, + }; +@@ -28,6 +29,8 @@ struct stack_info { + bool in_task_stack(unsigned long *stack, struct task_struct *task, + struct stack_info *info); + ++bool in_sysenter_stack(unsigned long *stack, struct stack_info *info); ++ + int get_stack_info(unsigned long *stack, struct task_struct *task, + struct stack_info *info, unsigned long *visit_mask); + +diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h +index 899084b70412..9b6df68d8fd1 100644 +--- a/arch/x86/include/asm/switch_to.h ++++ b/arch/x86/include/asm/switch_to.h +@@ -2,6 +2,8 @@ + #ifndef _ASM_X86_SWITCH_TO_H + #define _ASM_X86_SWITCH_TO_H + ++#include ++ + struct task_struct; /* one of the stranger aspects of C forward declarations */ + + struct task_struct *__switch_to_asm(struct task_struct *prev, +@@ -73,4 +75,28 @@ do { \ + ((last) = __switch_to_asm((prev), (next))); \ + } while (0) + ++#ifdef CONFIG_X86_32 ++static inline void refresh_sysenter_cs(struct thread_struct *thread) ++{ ++ /* Only happens when SEP is enabled, no need to test "SEP"arately: */ ++ if (unlikely(this_cpu_read(cpu_tss_rw.x86_tss.ss1) == thread->sysenter_cs)) ++ return; ++ ++ this_cpu_write(cpu_tss_rw.x86_tss.ss1, thread->sysenter_cs); ++ wrmsr(MSR_IA32_SYSENTER_CS, thread->sysenter_cs, 0); ++} ++#endif ++ ++/* This is used when switching tasks or entering/exiting vm86 mode. 
*/ ++static inline void update_sp0(struct task_struct *task) ++{ ++ /* On x86_64, sp0 always points to the entry trampoline stack, which is constant: */ ++#ifdef CONFIG_X86_32 ++ load_sp0(task->thread.sp0); ++#else ++ if (static_cpu_has(X86_FEATURE_XENPV)) ++ load_sp0(task_top_of_stack(task)); ++#endif ++} ++ + #endif /* _ASM_X86_SWITCH_TO_H */ +diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h +index 70f425947dc5..00223333821a 100644 +--- a/arch/x86/include/asm/thread_info.h ++++ b/arch/x86/include/asm/thread_info.h +@@ -207,7 +207,7 @@ static inline int arch_within_stack_frames(const void * const stack, + #else /* !__ASSEMBLY__ */ + + #ifdef CONFIG_X86_64 +-# define cpu_current_top_of_stack (cpu_tss + TSS_sp0) ++# define cpu_current_top_of_stack (cpu_tss_rw + TSS_sp1) + #endif + + #endif +diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h +index fa60398bbc3a..069c04be1507 100644 +--- a/arch/x86/include/asm/trace/fpu.h ++++ b/arch/x86/include/asm/trace/fpu.h +@@ -34,11 +34,6 @@ DECLARE_EVENT_CLASS(x86_fpu, + ) + ); + +-DEFINE_EVENT(x86_fpu, x86_fpu_state, +- TP_PROTO(struct fpu *fpu), +- TP_ARGS(fpu) +-); +- + DEFINE_EVENT(x86_fpu, x86_fpu_before_save, + TP_PROTO(struct fpu *fpu), + TP_ARGS(fpu) +@@ -74,11 +69,6 @@ DEFINE_EVENT(x86_fpu, x86_fpu_activate_state, + TP_ARGS(fpu) + ); + +-DEFINE_EVENT(x86_fpu, x86_fpu_deactivate_state, +- TP_PROTO(struct fpu *fpu), +- TP_ARGS(fpu) +-); +- + DEFINE_EVENT(x86_fpu, x86_fpu_init_state, + TP_PROTO(struct fpu *fpu), + TP_ARGS(fpu) +diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h +index b0cced97a6ce..31051f35cbb7 100644 +--- a/arch/x86/include/asm/traps.h ++++ b/arch/x86/include/asm/traps.h +@@ -38,9 +38,9 @@ asmlinkage void simd_coprocessor_error(void); + + #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV) + asmlinkage void xen_divide_error(void); ++asmlinkage void xen_xennmi(void); + asmlinkage void xen_xendebug(void); + asmlinkage void xen_xenint3(void); +-asmlinkage void xen_nmi(void); + asmlinkage void xen_overflow(void); + asmlinkage void xen_bounds(void); + asmlinkage void xen_invalid_op(void); +@@ -75,7 +75,6 @@ dotraplinkage void do_segment_not_present(struct pt_regs *, long); + dotraplinkage void do_stack_segment(struct pt_regs *, long); + #ifdef CONFIG_X86_64 + dotraplinkage void do_double_fault(struct pt_regs *, long); +-asmlinkage struct pt_regs *sync_regs(struct pt_regs *); + #endif + dotraplinkage void do_general_protection(struct pt_regs *, long); + dotraplinkage void do_page_fault(struct pt_regs *, unsigned long); +@@ -145,4 +144,22 @@ enum { + X86_TRAP_IRET = 32, /* 32, IRET Exception */ + }; + ++/* ++ * Page fault error code bits: ++ * ++ * bit 0 == 0: no page found 1: protection fault ++ * bit 1 == 0: read access 1: write access ++ * bit 2 == 0: kernel-mode access 1: user-mode access ++ * bit 3 == 1: use of reserved bit detected ++ * bit 4 == 1: fault was an instruction fetch ++ * bit 5 == 1: protection keys block access ++ */ ++enum x86_pf_error_code { ++ X86_PF_PROT = 1 << 0, ++ X86_PF_WRITE = 1 << 1, ++ X86_PF_USER = 1 << 2, ++ X86_PF_RSVD = 1 << 3, ++ X86_PF_INSTR = 1 << 4, ++ X86_PF_PK = 1 << 5, ++}; + #endif /* _ASM_X86_TRAPS_H */ +diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h +index 87adc0d38c4a..c1688c2d0a12 100644 +--- a/arch/x86/include/asm/unwind.h ++++ b/arch/x86/include/asm/unwind.h +@@ -7,17 +7,20 @@ + #include + #include + ++#define IRET_FRAME_OFFSET (offsetof(struct pt_regs, ip)) ++#define 
IRET_FRAME_SIZE (sizeof(struct pt_regs) - IRET_FRAME_OFFSET) ++ + struct unwind_state { + struct stack_info stack_info; + unsigned long stack_mask; + struct task_struct *task; + int graph_idx; + bool error; +-#if defined(CONFIG_ORC_UNWINDER) ++#if defined(CONFIG_UNWINDER_ORC) + bool signal, full_regs; + unsigned long sp, bp, ip; + struct pt_regs *regs; +-#elif defined(CONFIG_FRAME_POINTER_UNWINDER) ++#elif defined(CONFIG_UNWINDER_FRAME_POINTER) + bool got_irq; + unsigned long *bp, *orig_sp, ip; + struct pt_regs *regs; +@@ -51,7 +54,11 @@ void unwind_start(struct unwind_state *state, struct task_struct *task, + __unwind_start(state, task, regs, first_frame); + } + +-#if defined(CONFIG_ORC_UNWINDER) || defined(CONFIG_FRAME_POINTER_UNWINDER) ++#if defined(CONFIG_UNWINDER_ORC) || defined(CONFIG_UNWINDER_FRAME_POINTER) ++/* ++ * WARNING: The entire pt_regs may not be safe to dereference. In some cases, ++ * only the iret frame registers are accessible. Use with caution! ++ */ + static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state) + { + if (unwind_done(state)) +@@ -66,7 +73,7 @@ static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state) + } + #endif + +-#ifdef CONFIG_ORC_UNWINDER ++#ifdef CONFIG_UNWINDER_ORC + void unwind_init(void); + void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size, + void *orc, size_t orc_size); +diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h +index 8a1ebf9540dd..ad15a0fda917 100644 +--- a/arch/x86/include/asm/x86_init.h ++++ b/arch/x86/include/asm/x86_init.h +@@ -114,6 +114,18 @@ struct x86_init_pci { + void (*fixup_irqs)(void); + }; + ++/** ++ * struct x86_hyper_init - x86 hypervisor init functions ++ * @init_platform: platform setup ++ * @x2apic_available: X2APIC detection ++ * @init_mem_mapping: setup early mappings during init_mem_mapping() ++ */ ++struct x86_hyper_init { ++ void (*init_platform)(void); ++ bool (*x2apic_available)(void); ++ void (*init_mem_mapping)(void); ++}; ++ + /** + * struct x86_init_ops - functions for platform specific setup + * +@@ -127,6 +139,7 @@ struct x86_init_ops { + struct x86_init_timers timers; + struct x86_init_iommu iommu; + struct x86_init_pci pci; ++ struct x86_hyper_init hyper; + }; + + /** +@@ -199,6 +212,15 @@ struct x86_legacy_features { + struct x86_legacy_devices devices; + }; + ++/** ++ * struct x86_hyper_runtime - x86 hypervisor specific runtime callbacks ++ * ++ * @pin_vcpu: pin current vcpu to specified physical cpu (run rarely) ++ */ ++struct x86_hyper_runtime { ++ void (*pin_vcpu)(int cpu); ++}; ++ + /** + * struct x86_platform_ops - platform specific runtime functions + * @calibrate_cpu: calibrate CPU +@@ -218,6 +240,7 @@ struct x86_legacy_features { + * possible in x86_early_init_platform_quirks() by + * only using the current x86_hardware_subarch + * semantics. 
++ * @hyper: x86 hypervisor specific runtime callbacks + */ + struct x86_platform_ops { + unsigned long (*calibrate_cpu)(void); +@@ -233,6 +256,7 @@ struct x86_platform_ops { + void (*apic_post_init)(void); + struct x86_legacy_features legacy; + void (*set_legacy_features)(void); ++ struct x86_hyper_runtime hyper; + }; + + struct pci_dev; +diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h +index 6f3355399665..53b4ca55ebb6 100644 +--- a/arch/x86/include/uapi/asm/processor-flags.h ++++ b/arch/x86/include/uapi/asm/processor-flags.h +@@ -152,5 +152,8 @@ + #define CX86_ARR_BASE 0xc4 + #define CX86_RCR_BASE 0xdc + ++#define CR0_STATE (X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \ ++ X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \ ++ X86_CR0_PG) + + #endif /* _UAPI_ASM_X86_PROCESSOR_FLAGS_H */ +diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile +index 5f70044340ff..295abaa58add 100644 +--- a/arch/x86/kernel/Makefile ++++ b/arch/x86/kernel/Makefile +@@ -25,9 +25,9 @@ endif + KASAN_SANITIZE_head$(BITS).o := n + KASAN_SANITIZE_dumpstack.o := n + KASAN_SANITIZE_dumpstack_$(BITS).o := n +-KASAN_SANITIZE_stacktrace.o := n ++KASAN_SANITIZE_stacktrace.o := n ++KASAN_SANITIZE_paravirt.o := n + +-OBJECT_FILES_NON_STANDARD_head_$(BITS).o := y + OBJECT_FILES_NON_STANDARD_relocate_kernel_$(BITS).o := y + OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o := y + OBJECT_FILES_NON_STANDARD_test_nx.o := y +@@ -128,9 +128,9 @@ obj-$(CONFIG_PERF_EVENTS) += perf_regs.o + obj-$(CONFIG_TRACING) += tracepoint.o + obj-$(CONFIG_SCHED_MC_PRIO) += itmt.o + +-obj-$(CONFIG_ORC_UNWINDER) += unwind_orc.o +-obj-$(CONFIG_FRAME_POINTER_UNWINDER) += unwind_frame.o +-obj-$(CONFIG_GUESS_UNWINDER) += unwind_guess.o ++obj-$(CONFIG_UNWINDER_ORC) += unwind_orc.o ++obj-$(CONFIG_UNWINDER_FRAME_POINTER) += unwind_frame.o ++obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o + + ### + # 64 bit specific files +diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c +index ff891772c9f8..89c7c8569e5e 100644 +--- a/arch/x86/kernel/apic/apic.c ++++ b/arch/x86/kernel/apic/apic.c +@@ -1645,7 +1645,7 @@ static __init void try_to_enable_x2apic(int remap_mode) + * under KVM + */ + if (max_physical_apicid > 255 || +- !hypervisor_x2apic_available()) { ++ !x86_init.hyper.x2apic_available()) { + pr_info("x2apic: IRQ remapping doesn't support X2APIC mode\n"); + x2apic_disable(); + return; +diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c +index 0d57bb9079c9..c0b694810ff4 100644 +--- a/arch/x86/kernel/apic/x2apic_uv_x.c ++++ b/arch/x86/kernel/apic/x2apic_uv_x.c +@@ -920,9 +920,8 @@ static __init void uv_rtc_init(void) + /* + * percpu heartbeat timer + */ +-static void uv_heartbeat(unsigned long ignored) ++static void uv_heartbeat(struct timer_list *timer) + { +- struct timer_list *timer = &uv_scir_info->timer; + unsigned char bits = uv_scir_info->state; + + /* Flip heartbeat bit: */ +@@ -947,7 +946,7 @@ static int uv_heartbeat_enable(unsigned int cpu) + struct timer_list *timer = &uv_cpu_scir_info(cpu)->timer; + + uv_set_cpu_scir_bits(cpu, SCIR_CPU_HEARTBEAT|SCIR_CPU_ACTIVITY); +- setup_pinned_timer(timer, uv_heartbeat, cpu); ++ timer_setup(timer, uv_heartbeat, TIMER_PINNED); + timer->expires = jiffies + SCIR_CPU_HB_INTERVAL; + add_timer_on(timer, cpu); + uv_cpu_scir_info(cpu)->enabled = 1; +diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c +index 8ea78275480d..cd360a5e0dca 100644 +--- a/arch/x86/kernel/asm-offsets.c ++++ 
b/arch/x86/kernel/asm-offsets.c +@@ -93,4 +93,10 @@ void common(void) { + + BLANK(); + DEFINE(PTREGS_SIZE, sizeof(struct pt_regs)); ++ ++ /* Layout info for cpu_entry_area */ ++ OFFSET(CPU_ENTRY_AREA_tss, cpu_entry_area, tss); ++ OFFSET(CPU_ENTRY_AREA_entry_trampoline, cpu_entry_area, entry_trampoline); ++ OFFSET(CPU_ENTRY_AREA_SYSENTER_stack, cpu_entry_area, SYSENTER_stack_page); ++ DEFINE(SIZEOF_SYSENTER_stack, sizeof(struct SYSENTER_stack)); + } +diff --git a/arch/x86/kernel/asm-offsets_32.c b/arch/x86/kernel/asm-offsets_32.c +index dedf428b20b6..7d20d9c0b3d6 100644 +--- a/arch/x86/kernel/asm-offsets_32.c ++++ b/arch/x86/kernel/asm-offsets_32.c +@@ -47,13 +47,8 @@ void foo(void) + BLANK(); + + /* Offset from the sysenter stack to tss.sp0 */ +- DEFINE(TSS_sysenter_sp0, offsetof(struct tss_struct, x86_tss.sp0) - +- offsetofend(struct tss_struct, SYSENTER_stack)); +- +- /* Offset from cpu_tss to SYSENTER_stack */ +- OFFSET(CPU_TSS_SYSENTER_stack, tss_struct, SYSENTER_stack); +- /* Size of SYSENTER_stack */ +- DEFINE(SIZEOF_SYSENTER_stack, sizeof(((struct tss_struct *)0)->SYSENTER_stack)); ++ DEFINE(TSS_sysenter_sp0, offsetof(struct cpu_entry_area, tss.x86_tss.sp0) - ++ offsetofend(struct cpu_entry_area, SYSENTER_stack_page.stack)); + + #ifdef CONFIG_CC_STACKPROTECTOR + BLANK(); +diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c +index 630212fa9b9d..bf51e51d808d 100644 +--- a/arch/x86/kernel/asm-offsets_64.c ++++ b/arch/x86/kernel/asm-offsets_64.c +@@ -23,6 +23,9 @@ int main(void) + #ifdef CONFIG_PARAVIRT + OFFSET(PV_CPU_usergs_sysret64, pv_cpu_ops, usergs_sysret64); + OFFSET(PV_CPU_swapgs, pv_cpu_ops, swapgs); ++#ifdef CONFIG_DEBUG_ENTRY ++ OFFSET(PV_IRQ_save_fl, pv_irq_ops, save_fl); ++#endif + BLANK(); + #endif + +@@ -63,6 +66,7 @@ int main(void) + + OFFSET(TSS_ist, tss_struct, x86_tss.ist); + OFFSET(TSS_sp0, tss_struct, x86_tss.sp0); ++ OFFSET(TSS_sp1, tss_struct, x86_tss.sp1); + BLANK(); + + #ifdef CONFIG_CC_STACKPROTECTOR +diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile +index c60922a66385..90cb82dbba57 100644 +--- a/arch/x86/kernel/cpu/Makefile ++++ b/arch/x86/kernel/cpu/Makefile +@@ -23,6 +23,7 @@ obj-y += rdrand.o + obj-y += match.o + obj-y += bugs.o + obj-$(CONFIG_CPU_FREQ) += aperfmperf.o ++obj-y += cpuid-deps.o + + obj-$(CONFIG_PROC_FS) += proc.o + obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o +diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c +index d58184b7cd44..bcb75dc97d44 100644 +--- a/arch/x86/kernel/cpu/amd.c ++++ b/arch/x86/kernel/cpu/amd.c +@@ -804,8 +804,11 @@ static void init_amd(struct cpuinfo_x86 *c) + case 0x17: init_amd_zn(c); break; + } + +- /* Enable workaround for FXSAVE leak */ +- if (c->x86 >= 6) ++ /* ++ * Enable workaround for FXSAVE leak on CPUs ++ * without a XSaveErPtr feature ++ */ ++ if ((c->x86 >= 6) && (!cpu_has(c, X86_FEATURE_XSAVEERPTR))) + set_cpu_bug(c, X86_BUG_FXSAVE_LEAK); + + cpu_detect_cache_sizes(c); +diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c +index c9176bae7fd8..034900623adf 100644 +--- a/arch/x86/kernel/cpu/common.c ++++ b/arch/x86/kernel/cpu/common.c +@@ -452,8 +452,8 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c) + return NULL; /* Not found */ + } + +-__u32 cpu_caps_cleared[NCAPINTS]; +-__u32 cpu_caps_set[NCAPINTS]; ++__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS]; ++__u32 cpu_caps_set[NCAPINTS + NBUGINTS]; + + void load_percpu_segment(int cpu) + { +@@ -466,27 +466,116 @@ void load_percpu_segment(int cpu) + 
load_stack_canary_segment(); + } + +-/* Setup the fixmap mapping only once per-processor */ +-static inline void setup_fixmap_gdt(int cpu) ++#ifdef CONFIG_X86_32 ++/* The 32-bit entry code needs to find cpu_entry_area. */ ++DEFINE_PER_CPU(struct cpu_entry_area *, cpu_entry_area); ++#endif ++ ++#ifdef CONFIG_X86_64 ++/* ++ * Special IST stacks which the CPU switches to when it calls ++ * an IST-marked descriptor entry. Up to 7 stacks (hardware ++ * limit), all of them are 4K, except the debug stack which ++ * is 8K. ++ */ ++static const unsigned int exception_stack_sizes[N_EXCEPTION_STACKS] = { ++ [0 ... N_EXCEPTION_STACKS - 1] = EXCEPTION_STKSZ, ++ [DEBUG_STACK - 1] = DEBUG_STKSZ ++}; ++ ++static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks ++ [(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]); ++#endif ++ ++static DEFINE_PER_CPU_PAGE_ALIGNED(struct SYSENTER_stack_page, ++ SYSENTER_stack_storage); ++ ++static void __init ++set_percpu_fixmap_pages(int idx, void *ptr, int pages, pgprot_t prot) ++{ ++ for ( ; pages; pages--, idx--, ptr += PAGE_SIZE) ++ __set_fixmap(idx, per_cpu_ptr_to_phys(ptr), prot); ++} ++ ++/* Setup the fixmap mappings only once per-processor */ ++static void __init setup_cpu_entry_area(int cpu) + { + #ifdef CONFIG_X86_64 +- /* On 64-bit systems, we use a read-only fixmap GDT. */ +- pgprot_t prot = PAGE_KERNEL_RO; ++ extern char _entry_trampoline[]; ++ ++ /* On 64-bit systems, we use a read-only fixmap GDT and TSS. */ ++ pgprot_t gdt_prot = PAGE_KERNEL_RO; ++ pgprot_t tss_prot = PAGE_KERNEL_RO; + #else + /* + * On native 32-bit systems, the GDT cannot be read-only because + * our double fault handler uses a task gate, and entering through +- * a task gate needs to change an available TSS to busy. If the GDT +- * is read-only, that will triple fault. ++ * a task gate needs to change an available TSS to busy. If the ++ * GDT is read-only, that will triple fault. The TSS cannot be ++ * read-only because the CPU writes to it on task switches. + * +- * On Xen PV, the GDT must be read-only because the hypervisor requires +- * it. ++ * On Xen PV, the GDT must be read-only because the hypervisor ++ * requires it. + */ +- pgprot_t prot = boot_cpu_has(X86_FEATURE_XENPV) ? ++ pgprot_t gdt_prot = boot_cpu_has(X86_FEATURE_XENPV) ? + PAGE_KERNEL_RO : PAGE_KERNEL; ++ pgprot_t tss_prot = PAGE_KERNEL; ++#endif ++ ++ __set_fixmap(get_cpu_entry_area_index(cpu, gdt), get_cpu_gdt_paddr(cpu), gdt_prot); ++ set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, SYSENTER_stack_page), ++ per_cpu_ptr(&SYSENTER_stack_storage, cpu), 1, ++ PAGE_KERNEL); ++ ++ /* ++ * The Intel SDM says (Volume 3, 7.2.1): ++ * ++ * Avoid placing a page boundary in the part of the TSS that the ++ * processor reads during a task switch (the first 104 bytes). The ++ * processor may not correctly perform address translations if a ++ * boundary occurs in this area. During a task switch, the processor ++ * reads and writes into the first 104 bytes of each TSS (using ++ * contiguous physical addresses beginning with the physical address ++ * of the first byte of the TSS). So, after TSS access begins, if ++ * part of the 104 bytes is not physically contiguous, the processor ++ * will access incorrect information without generating a page-fault ++ * exception. ++ * ++ * There are also a lot of errata involving the TSS spanning a page ++ * boundary. Assert that we're not doing that. 
++ */ ++ BUILD_BUG_ON((offsetof(struct tss_struct, x86_tss) ^ ++ offsetofend(struct tss_struct, x86_tss)) & PAGE_MASK); ++ BUILD_BUG_ON(sizeof(struct tss_struct) % PAGE_SIZE != 0); ++ set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, tss), ++ &per_cpu(cpu_tss_rw, cpu), ++ sizeof(struct tss_struct) / PAGE_SIZE, ++ tss_prot); ++ ++#ifdef CONFIG_X86_32 ++ per_cpu(cpu_entry_area, cpu) = get_cpu_entry_area(cpu); + #endif + +- __set_fixmap(get_cpu_gdt_ro_index(cpu), get_cpu_gdt_paddr(cpu), prot); ++#ifdef CONFIG_X86_64 ++ BUILD_BUG_ON(sizeof(exception_stacks) % PAGE_SIZE != 0); ++ BUILD_BUG_ON(sizeof(exception_stacks) != ++ sizeof(((struct cpu_entry_area *)0)->exception_stacks)); ++ set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, exception_stacks), ++ &per_cpu(exception_stacks, cpu), ++ sizeof(exception_stacks) / PAGE_SIZE, ++ PAGE_KERNEL); ++ ++ __set_fixmap(get_cpu_entry_area_index(cpu, entry_trampoline), ++ __pa_symbol(_entry_trampoline), PAGE_KERNEL_RX); ++#endif ++} ++ ++void __init setup_cpu_entry_areas(void) ++{ ++ unsigned int cpu; ++ ++ for_each_possible_cpu(cpu) ++ setup_cpu_entry_area(cpu); + } + + /* Load the original GDT from the per-cpu structure */ +@@ -723,7 +812,7 @@ static void apply_forced_caps(struct cpuinfo_x86 *c) + { + int i; + +- for (i = 0; i < NCAPINTS; i++) { ++ for (i = 0; i < NCAPINTS + NBUGINTS; i++) { + c->x86_capability[i] &= ~cpu_caps_cleared[i]; + c->x86_capability[i] |= cpu_caps_set[i]; + } +@@ -1225,7 +1314,7 @@ void enable_sep_cpu(void) + return; + + cpu = get_cpu(); +- tss = &per_cpu(cpu_tss, cpu); ++ tss = &per_cpu(cpu_tss_rw, cpu); + + /* + * We cache MSR_IA32_SYSENTER_CS's value in the TSS's ss1 field -- +@@ -1234,11 +1323,7 @@ void enable_sep_cpu(void) + + tss->x86_tss.ss1 = __KERNEL_CS; + wrmsr(MSR_IA32_SYSENTER_CS, tss->x86_tss.ss1, 0); +- +- wrmsr(MSR_IA32_SYSENTER_ESP, +- (unsigned long)tss + offsetofend(struct tss_struct, SYSENTER_stack), +- 0); +- ++ wrmsr(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_SYSENTER_stack(cpu) + 1), 0); + wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long)entry_SYSENTER_32, 0); + + put_cpu(); +@@ -1301,18 +1386,16 @@ void print_cpu_info(struct cpuinfo_x86 *c) + pr_cont(")\n"); + } + +-static __init int setup_disablecpuid(char *arg) ++/* ++ * clearcpuid= was already parsed in fpu__init_parse_early_param. ++ * But we need to keep a dummy __setup around otherwise it would ++ * show up as an environment variable for init. ++ */ ++static __init int setup_clearcpuid(char *arg) + { +- int bit; +- +- if (get_option(&arg, &bit) && bit >= 0 && bit < NCAPINTS * 32) +- setup_clear_cpu_cap(bit); +- else +- return 0; +- + return 1; + } +-__setup("clearcpuid=", setup_disablecpuid); ++__setup("clearcpuid=", setup_clearcpuid); + + #ifdef CONFIG_X86_64 + DEFINE_PER_CPU_FIRST(union irq_stack_union, +@@ -1334,25 +1417,19 @@ DEFINE_PER_CPU(unsigned int, irq_count) __visible = -1; + DEFINE_PER_CPU(int, __preempt_count) = INIT_PREEMPT_COUNT; + EXPORT_PER_CPU_SYMBOL(__preempt_count); + +-/* +- * Special IST stacks which the CPU switches to when it calls +- * an IST-marked descriptor entry. Up to 7 stacks (hardware +- * limit), all of them are 4K, except the debug stack which +- * is 8K. +- */ +-static const unsigned int exception_stack_sizes[N_EXCEPTION_STACKS] = { +- [0 ... 
N_EXCEPTION_STACKS - 1] = EXCEPTION_STKSZ, +- [DEBUG_STACK - 1] = DEBUG_STKSZ +-}; +- +-static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks +- [(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]); +- + /* May not be marked __init: used by software suspend */ + void syscall_init(void) + { ++ extern char _entry_trampoline[]; ++ extern char entry_SYSCALL_64_trampoline[]; ++ ++ int cpu = smp_processor_id(); ++ unsigned long SYSCALL64_entry_trampoline = ++ (unsigned long)get_cpu_entry_area(cpu)->entry_trampoline + ++ (entry_SYSCALL_64_trampoline - _entry_trampoline); ++ + wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS); +- wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64); ++ wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline); + + #ifdef CONFIG_IA32_EMULATION + wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat); +@@ -1363,7 +1440,7 @@ void syscall_init(void) + * AMD doesn't allow SYSENTER in long mode (either 32- or 64-bit). + */ + wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS); +- wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL); ++ wrmsrl_safe(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_SYSENTER_stack(cpu) + 1)); + wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat); + #else + wrmsrl(MSR_CSTAR, (unsigned long)ignore_sysret); +@@ -1507,7 +1584,7 @@ void cpu_init(void) + if (cpu) + load_ucode_ap(); + +- t = &per_cpu(cpu_tss, cpu); ++ t = &per_cpu(cpu_tss_rw, cpu); + oist = &per_cpu(orig_ist, cpu); + + #ifdef CONFIG_NUMA +@@ -1546,7 +1623,7 @@ void cpu_init(void) + * set up and load the per-CPU TSS + */ + if (!oist->ist[0]) { +- char *estacks = per_cpu(exception_stacks, cpu); ++ char *estacks = get_cpu_entry_area(cpu)->exception_stacks; + + for (v = 0; v < N_EXCEPTION_STACKS; v++) { + estacks += exception_stack_sizes[v]; +@@ -1557,7 +1634,7 @@ void cpu_init(void) + } + } + +- t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); ++ t->x86_tss.io_bitmap_base = IO_BITMAP_OFFSET; + + /* + * <= is required because the CPU will access up to +@@ -1572,9 +1649,14 @@ void cpu_init(void) + initialize_tlbstate_and_flush(); + enter_lazy_tlb(&init_mm, me); + +- load_sp0(t, ¤t->thread); +- set_tss_desc(cpu, t); ++ /* ++ * Initialize the TSS. sp0 points to the entry trampoline stack ++ * regardless of what task is running. ++ */ ++ set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); + load_TR_desc(); ++ load_sp0((unsigned long)(cpu_SYSENTER_stack(cpu) + 1)); ++ + load_mm_ldt(&init_mm); + + clear_all_debug_regs(); +@@ -1585,7 +1667,6 @@ void cpu_init(void) + if (is_uv_system()) + uv_cpu_init(); + +- setup_fixmap_gdt(cpu); + load_fixmap_gdt(cpu); + } + +@@ -1595,8 +1676,7 @@ void cpu_init(void) + { + int cpu = smp_processor_id(); + struct task_struct *curr = current; +- struct tss_struct *t = &per_cpu(cpu_tss, cpu); +- struct thread_struct *thread = &curr->thread; ++ struct tss_struct *t = &per_cpu(cpu_tss_rw, cpu); + + wait_for_master_cpu(cpu); + +@@ -1627,12 +1707,16 @@ void cpu_init(void) + initialize_tlbstate_and_flush(); + enter_lazy_tlb(&init_mm, curr); + +- load_sp0(t, thread); +- set_tss_desc(cpu, t); ++ /* ++ * Initialize the TSS. Don't bother initializing sp0, as the initial ++ * task never enters user mode. 
++ */ ++ set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); + load_TR_desc(); ++ + load_mm_ldt(&init_mm); + +- t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); ++ t->x86_tss.io_bitmap_base = IO_BITMAP_OFFSET; + + #ifdef CONFIG_DOUBLEFAULT + /* Set up doublefault TSS pointer in the GDT */ +@@ -1644,7 +1728,6 @@ void cpu_init(void) + + fpu__init_cpu(); + +- setup_fixmap_gdt(cpu); + load_fixmap_gdt(cpu); + } + #endif +diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c +new file mode 100644 +index 000000000000..904b0a3c4e53 +--- /dev/null ++++ b/arch/x86/kernel/cpu/cpuid-deps.c +@@ -0,0 +1,121 @@ ++/* Declare dependencies between CPUIDs */ ++#include ++#include ++#include ++#include ++ ++struct cpuid_dep { ++ unsigned int feature; ++ unsigned int depends; ++}; ++ ++/* ++ * Table of CPUID features that depend on others. ++ * ++ * This only includes dependencies that can be usefully disabled, not ++ * features part of the base set (like FPU). ++ * ++ * Note this all is not __init / __initdata because it can be ++ * called from cpu hotplug. It shouldn't do anything in this case, ++ * but it's difficult to tell that to the init reference checker. ++ */ ++const static struct cpuid_dep cpuid_deps[] = { ++ { X86_FEATURE_XSAVEOPT, X86_FEATURE_XSAVE }, ++ { X86_FEATURE_XSAVEC, X86_FEATURE_XSAVE }, ++ { X86_FEATURE_XSAVES, X86_FEATURE_XSAVE }, ++ { X86_FEATURE_AVX, X86_FEATURE_XSAVE }, ++ { X86_FEATURE_PKU, X86_FEATURE_XSAVE }, ++ { X86_FEATURE_MPX, X86_FEATURE_XSAVE }, ++ { X86_FEATURE_XGETBV1, X86_FEATURE_XSAVE }, ++ { X86_FEATURE_FXSR_OPT, X86_FEATURE_FXSR }, ++ { X86_FEATURE_XMM, X86_FEATURE_FXSR }, ++ { X86_FEATURE_XMM2, X86_FEATURE_XMM }, ++ { X86_FEATURE_XMM3, X86_FEATURE_XMM2 }, ++ { X86_FEATURE_XMM4_1, X86_FEATURE_XMM2 }, ++ { X86_FEATURE_XMM4_2, X86_FEATURE_XMM2 }, ++ { X86_FEATURE_XMM3, X86_FEATURE_XMM2 }, ++ { X86_FEATURE_PCLMULQDQ, X86_FEATURE_XMM2 }, ++ { X86_FEATURE_SSSE3, X86_FEATURE_XMM2, }, ++ { X86_FEATURE_F16C, X86_FEATURE_XMM2, }, ++ { X86_FEATURE_AES, X86_FEATURE_XMM2 }, ++ { X86_FEATURE_SHA_NI, X86_FEATURE_XMM2 }, ++ { X86_FEATURE_FMA, X86_FEATURE_AVX }, ++ { X86_FEATURE_AVX2, X86_FEATURE_AVX, }, ++ { X86_FEATURE_AVX512F, X86_FEATURE_AVX, }, ++ { X86_FEATURE_AVX512IFMA, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512PF, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512ER, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512CD, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512DQ, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512BW, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512VL, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512VBMI, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512_VBMI2, X86_FEATURE_AVX512VL }, ++ { X86_FEATURE_GFNI, X86_FEATURE_AVX512VL }, ++ { X86_FEATURE_VAES, X86_FEATURE_AVX512VL }, ++ { X86_FEATURE_VPCLMULQDQ, X86_FEATURE_AVX512VL }, ++ { X86_FEATURE_AVX512_VNNI, X86_FEATURE_AVX512VL }, ++ { X86_FEATURE_AVX512_BITALG, X86_FEATURE_AVX512VL }, ++ { X86_FEATURE_AVX512_4VNNIW, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512_4FMAPS, X86_FEATURE_AVX512F }, ++ { X86_FEATURE_AVX512_VPOPCNTDQ, X86_FEATURE_AVX512F }, ++ {} ++}; ++ ++static inline void clear_feature(struct cpuinfo_x86 *c, unsigned int feature) ++{ ++ /* ++ * Note: This could use the non atomic __*_bit() variants, but the ++ * rest of the cpufeature code uses atomics as well, so keep it for ++ * consistency. Cleanup all of it separately. 
++ */ ++ if (!c) { ++ clear_cpu_cap(&boot_cpu_data, feature); ++ set_bit(feature, (unsigned long *)cpu_caps_cleared); ++ } else { ++ clear_bit(feature, (unsigned long *)c->x86_capability); ++ } ++} ++ ++/* Take the capabilities and the BUG bits into account */ ++#define MAX_FEATURE_BITS ((NCAPINTS + NBUGINTS) * sizeof(u32) * 8) ++ ++static void do_clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature) ++{ ++ DECLARE_BITMAP(disable, MAX_FEATURE_BITS); ++ const struct cpuid_dep *d; ++ bool changed; ++ ++ if (WARN_ON(feature >= MAX_FEATURE_BITS)) ++ return; ++ ++ clear_feature(c, feature); ++ ++ /* Collect all features to disable, handling dependencies */ ++ memset(disable, 0, sizeof(disable)); ++ __set_bit(feature, disable); ++ ++ /* Loop until we get a stable state. */ ++ do { ++ changed = false; ++ for (d = cpuid_deps; d->feature; d++) { ++ if (!test_bit(d->depends, disable)) ++ continue; ++ if (__test_and_set_bit(d->feature, disable)) ++ continue; ++ ++ changed = true; ++ clear_feature(c, d->feature); ++ } ++ } while (changed); ++} ++ ++void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature) ++{ ++ do_clear_cpu_cap(c, feature); ++} ++ ++void setup_clear_cpu_cap(unsigned int feature) ++{ ++ do_clear_cpu_cap(NULL, feature); ++} +diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c +index 4fa90006ac68..bea8d3e24f50 100644 +--- a/arch/x86/kernel/cpu/hypervisor.c ++++ b/arch/x86/kernel/cpu/hypervisor.c +@@ -26,6 +26,12 @@ + #include + #include + ++extern const struct hypervisor_x86 x86_hyper_vmware; ++extern const struct hypervisor_x86 x86_hyper_ms_hyperv; ++extern const struct hypervisor_x86 x86_hyper_xen_pv; ++extern const struct hypervisor_x86 x86_hyper_xen_hvm; ++extern const struct hypervisor_x86 x86_hyper_kvm; ++ + static const __initconst struct hypervisor_x86 * const hypervisors[] = + { + #ifdef CONFIG_XEN_PV +@@ -41,54 +47,52 @@ static const __initconst struct hypervisor_x86 * const hypervisors[] = + #endif + }; + +-const struct hypervisor_x86 *x86_hyper; +-EXPORT_SYMBOL(x86_hyper); ++enum x86_hypervisor_type x86_hyper_type; ++EXPORT_SYMBOL(x86_hyper_type); + +-static inline void __init ++static inline const struct hypervisor_x86 * __init + detect_hypervisor_vendor(void) + { +- const struct hypervisor_x86 *h, * const *p; ++ const struct hypervisor_x86 *h = NULL, * const *p; + uint32_t pri, max_pri = 0; + + for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) { +- h = *p; +- pri = h->detect(); +- if (pri != 0 && pri > max_pri) { ++ pri = (*p)->detect(); ++ if (pri > max_pri) { + max_pri = pri; +- x86_hyper = h; ++ h = *p; + } + } + +- if (max_pri) +- pr_info("Hypervisor detected: %s\n", x86_hyper->name); ++ if (h) ++ pr_info("Hypervisor detected: %s\n", h->name); ++ ++ return h; + } + +-void __init init_hypervisor_platform(void) ++static void __init copy_array(const void *src, void *target, unsigned int size) + { ++ unsigned int i, n = size / sizeof(void *); ++ const void * const *from = (const void * const *)src; ++ const void **to = (const void **)target; + +- detect_hypervisor_vendor(); +- +- if (!x86_hyper) +- return; +- +- if (x86_hyper->init_platform) +- x86_hyper->init_platform(); ++ for (i = 0; i < n; i++) ++ if (from[i]) ++ to[i] = from[i]; + } + +-bool __init hypervisor_x2apic_available(void) ++void __init init_hypervisor_platform(void) + { +- return x86_hyper && +- x86_hyper->x2apic_available && +- x86_hyper->x2apic_available(); +-} ++ const struct hypervisor_x86 *h; + +-void hypervisor_pin_vcpu(int cpu) +-{ +- if 
(!x86_hyper) ++ h = detect_hypervisor_vendor(); ++ ++ if (!h) + return; + +- if (x86_hyper->pin_vcpu) +- x86_hyper->pin_vcpu(cpu); +- else +- WARN_ONCE(1, "vcpu pinning requested but not supported!\n"); ++ copy_array(&h->init, &x86_init.hyper, sizeof(h->init)); ++ copy_array(&h->runtime, &x86_platform.hyper, sizeof(h->runtime)); ++ ++ x86_hyper_type = h->type; ++ x86_init.hyper.init_platform(); + } +diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c +index 236324e83a3a..85eb5fc180c8 100644 +--- a/arch/x86/kernel/cpu/mshyperv.c ++++ b/arch/x86/kernel/cpu/mshyperv.c +@@ -254,9 +254,9 @@ static void __init ms_hyperv_init_platform(void) + #endif + } + +-const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = { ++const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = { + .name = "Microsoft Hyper-V", + .detect = ms_hyperv_platform, +- .init_platform = ms_hyperv_init_platform, ++ .type = X86_HYPER_MS_HYPERV, ++ .init.init_platform = ms_hyperv_init_platform, + }; +-EXPORT_SYMBOL(x86_hyper_ms_hyperv); +diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c +index 40ed26852ebd..8e005329648b 100644 +--- a/arch/x86/kernel/cpu/vmware.c ++++ b/arch/x86/kernel/cpu/vmware.c +@@ -205,10 +205,10 @@ static bool __init vmware_legacy_x2apic_available(void) + (eax & (1 << VMWARE_PORT_CMD_LEGACY_X2APIC)) != 0; + } + +-const __refconst struct hypervisor_x86 x86_hyper_vmware = { ++const __initconst struct hypervisor_x86 x86_hyper_vmware = { + .name = "VMware", + .detect = vmware_platform, +- .init_platform = vmware_platform_setup, +- .x2apic_available = vmware_legacy_x2apic_available, ++ .type = X86_HYPER_VMWARE, ++ .init.init_platform = vmware_platform_setup, ++ .init.x2apic_available = vmware_legacy_x2apic_available, + }; +-EXPORT_SYMBOL(x86_hyper_vmware); +diff --git a/arch/x86/kernel/doublefault.c b/arch/x86/kernel/doublefault.c +index 0e662c55ae90..0b8cedb20d6d 100644 +--- a/arch/x86/kernel/doublefault.c ++++ b/arch/x86/kernel/doublefault.c +@@ -50,25 +50,23 @@ static void doublefault_fn(void) + cpu_relax(); + } + +-struct tss_struct doublefault_tss __cacheline_aligned = { +- .x86_tss = { +- .sp0 = STACK_START, +- .ss0 = __KERNEL_DS, +- .ldt = 0, +- .io_bitmap_base = INVALID_IO_BITMAP_OFFSET, +- +- .ip = (unsigned long) doublefault_fn, +- /* 0x2 bit is always set */ +- .flags = X86_EFLAGS_SF | 0x2, +- .sp = STACK_START, +- .es = __USER_DS, +- .cs = __KERNEL_CS, +- .ss = __KERNEL_DS, +- .ds = __USER_DS, +- .fs = __KERNEL_PERCPU, +- +- .__cr3 = __pa_nodebug(swapper_pg_dir), +- } ++struct x86_hw_tss doublefault_tss __cacheline_aligned = { ++ .sp0 = STACK_START, ++ .ss0 = __KERNEL_DS, ++ .ldt = 0, ++ .io_bitmap_base = INVALID_IO_BITMAP_OFFSET, ++ ++ .ip = (unsigned long) doublefault_fn, ++ /* 0x2 bit is always set */ ++ .flags = X86_EFLAGS_SF | 0x2, ++ .sp = STACK_START, ++ .es = __USER_DS, ++ .cs = __KERNEL_CS, ++ .ss = __KERNEL_DS, ++ .ds = __USER_DS, ++ .fs = __KERNEL_PERCPU, ++ ++ .__cr3 = __pa_nodebug(swapper_pg_dir), + }; + + /* dummy for do_double_fault() call */ +diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c +index f13b4c00a5de..bbd6d986e2d0 100644 +--- a/arch/x86/kernel/dumpstack.c ++++ b/arch/x86/kernel/dumpstack.c +@@ -43,6 +43,24 @@ bool in_task_stack(unsigned long *stack, struct task_struct *task, + return true; + } + ++bool in_sysenter_stack(unsigned long *stack, struct stack_info *info) ++{ ++ struct SYSENTER_stack *ss = cpu_SYSENTER_stack(smp_processor_id()); ++ ++ void *begin = ss; ++ void *end = ss + 1; ++ ++ if 
((void *)stack < begin || (void *)stack >= end) ++ return false; ++ ++ info->type = STACK_TYPE_SYSENTER; ++ info->begin = begin; ++ info->end = end; ++ info->next_sp = NULL; ++ ++ return true; ++} ++ + static void printk_stack_address(unsigned long address, int reliable, + char *log_lvl) + { +@@ -50,6 +68,28 @@ static void printk_stack_address(unsigned long address, int reliable, + printk("%s %s%pB\n", log_lvl, reliable ? "" : "? ", (void *)address); + } + ++void show_iret_regs(struct pt_regs *regs) ++{ ++ printk(KERN_DEFAULT "RIP: %04x:%pS\n", (int)regs->cs, (void *)regs->ip); ++ printk(KERN_DEFAULT "RSP: %04x:%016lx EFLAGS: %08lx", (int)regs->ss, ++ regs->sp, regs->flags); ++} ++ ++static void show_regs_safe(struct stack_info *info, struct pt_regs *regs) ++{ ++ if (on_stack(info, regs, sizeof(*regs))) ++ __show_regs(regs, 0); ++ else if (on_stack(info, (void *)regs + IRET_FRAME_OFFSET, ++ IRET_FRAME_SIZE)) { ++ /* ++ * When an interrupt or exception occurs in entry code, the ++ * full pt_regs might not have been saved yet. In that case ++ * just print the iret frame. ++ */ ++ show_iret_regs(regs); ++ } ++} ++ + void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, + unsigned long *stack, char *log_lvl) + { +@@ -71,31 +111,35 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, + * - task stack + * - interrupt stack + * - HW exception stacks (double fault, nmi, debug, mce) ++ * - SYSENTER stack + * +- * x86-32 can have up to three stacks: ++ * x86-32 can have up to four stacks: + * - task stack + * - softirq stack + * - hardirq stack ++ * - SYSENTER stack + */ + for (regs = NULL; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) { + const char *stack_name; + +- /* +- * If we overflowed the task stack into a guard page, jump back +- * to the bottom of the usable stack. +- */ +- if (task_stack_page(task) - (void *)stack < PAGE_SIZE) +- stack = task_stack_page(task); +- +- if (get_stack_info(stack, task, &stack_info, &visit_mask)) +- break; ++ if (get_stack_info(stack, task, &stack_info, &visit_mask)) { ++ /* ++ * We weren't on a valid stack. It's possible that ++ * we overflowed a valid stack into a guard page. ++ * See if the next page up is valid so that we can ++ * generate some kind of backtrace if this happens. ++ */ ++ stack = (unsigned long *)PAGE_ALIGN((unsigned long)stack); ++ if (get_stack_info(stack, task, &stack_info, &visit_mask)) ++ break; ++ } + + stack_name = stack_type_name(stack_info.type); + if (stack_name) + printk("%s <%s>\n", log_lvl, stack_name); + +- if (regs && on_stack(&stack_info, regs, sizeof(*regs))) +- __show_regs(regs, 0); ++ if (regs) ++ show_regs_safe(&stack_info, regs); + + /* + * Scan the stack, printing any text addresses we find. At the +@@ -119,7 +163,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, + + /* + * Don't print regs->ip again if it was already printed +- * by __show_regs() below. ++ * by show_regs_safe() below. 
+ */ + if (regs && stack == ®s->ip) + goto next; +@@ -155,8 +199,8 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, + + /* if the frame has entry regs, print them */ + regs = unwind_get_entry_regs(&state); +- if (regs && on_stack(&stack_info, regs, sizeof(*regs))) +- __show_regs(regs, 0); ++ if (regs) ++ show_regs_safe(&stack_info, regs); + } + + if (stack_name) +diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c +index daefae83a3aa..5ff13a6b3680 100644 +--- a/arch/x86/kernel/dumpstack_32.c ++++ b/arch/x86/kernel/dumpstack_32.c +@@ -26,6 +26,9 @@ const char *stack_type_name(enum stack_type type) + if (type == STACK_TYPE_SOFTIRQ) + return "SOFTIRQ"; + ++ if (type == STACK_TYPE_SYSENTER) ++ return "SYSENTER"; ++ + return NULL; + } + +@@ -93,6 +96,9 @@ int get_stack_info(unsigned long *stack, struct task_struct *task, + if (task != current) + goto unknown; + ++ if (in_sysenter_stack(stack, info)) ++ goto recursion_check; ++ + if (in_hardirq_stack(stack, info)) + goto recursion_check; + +diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c +index 88ce2ffdb110..abc828f8c297 100644 +--- a/arch/x86/kernel/dumpstack_64.c ++++ b/arch/x86/kernel/dumpstack_64.c +@@ -37,6 +37,9 @@ const char *stack_type_name(enum stack_type type) + if (type == STACK_TYPE_IRQ) + return "IRQ"; + ++ if (type == STACK_TYPE_SYSENTER) ++ return "SYSENTER"; ++ + if (type >= STACK_TYPE_EXCEPTION && type <= STACK_TYPE_EXCEPTION_LAST) + return exception_stack_names[type - STACK_TYPE_EXCEPTION]; + +@@ -115,6 +118,9 @@ int get_stack_info(unsigned long *stack, struct task_struct *task, + if (in_irq_stack(stack, info)) + goto recursion_check; + ++ if (in_sysenter_stack(stack, info)) ++ goto recursion_check; ++ + goto unknown; + + recursion_check: +diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c +index 7affb7e3d9a5..6abd83572b01 100644 +--- a/arch/x86/kernel/fpu/init.c ++++ b/arch/x86/kernel/fpu/init.c +@@ -249,6 +249,10 @@ static void __init fpu__init_system_ctx_switch(void) + */ + static void __init fpu__init_parse_early_param(void) + { ++ char arg[32]; ++ char *argptr = arg; ++ int bit; ++ + if (cmdline_find_option_bool(boot_command_line, "no387")) + setup_clear_cpu_cap(X86_FEATURE_FPU); + +@@ -266,6 +270,13 @@ static void __init fpu__init_parse_early_param(void) + + if (cmdline_find_option_bool(boot_command_line, "noxsaves")) + setup_clear_cpu_cap(X86_FEATURE_XSAVES); ++ ++ if (cmdline_find_option(boot_command_line, "clearcpuid", arg, ++ sizeof(arg)) && ++ get_option(&argptr, &bit) && ++ bit >= 0 && ++ bit < NCAPINTS * 32) ++ setup_clear_cpu_cap(bit); + } + + /* +diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c +index f1d5476c9022..87a57b7642d3 100644 +--- a/arch/x86/kernel/fpu/xstate.c ++++ b/arch/x86/kernel/fpu/xstate.c +@@ -15,6 +15,7 @@ + #include + + #include ++#include + + /* + * Although we spell it out in here, the Processor Trace +@@ -36,6 +37,19 @@ static const char *xfeature_names[] = + "unknown xstate feature" , + }; + ++static short xsave_cpuid_features[] __initdata = { ++ X86_FEATURE_FPU, ++ X86_FEATURE_XMM, ++ X86_FEATURE_AVX, ++ X86_FEATURE_MPX, ++ X86_FEATURE_MPX, ++ X86_FEATURE_AVX512F, ++ X86_FEATURE_AVX512F, ++ X86_FEATURE_AVX512F, ++ X86_FEATURE_INTEL_PT, ++ X86_FEATURE_PKU, ++}; ++ + /* + * Mask of xstate features supported by the CPU and the kernel: + */ +@@ -59,26 +73,6 @@ unsigned int fpu_user_xstate_size; + void fpu__xstate_clear_all_cpu_caps(void) + { + 
setup_clear_cpu_cap(X86_FEATURE_XSAVE); +- setup_clear_cpu_cap(X86_FEATURE_XSAVEOPT); +- setup_clear_cpu_cap(X86_FEATURE_XSAVEC); +- setup_clear_cpu_cap(X86_FEATURE_XSAVES); +- setup_clear_cpu_cap(X86_FEATURE_AVX); +- setup_clear_cpu_cap(X86_FEATURE_AVX2); +- setup_clear_cpu_cap(X86_FEATURE_AVX512F); +- setup_clear_cpu_cap(X86_FEATURE_AVX512IFMA); +- setup_clear_cpu_cap(X86_FEATURE_AVX512PF); +- setup_clear_cpu_cap(X86_FEATURE_AVX512ER); +- setup_clear_cpu_cap(X86_FEATURE_AVX512CD); +- setup_clear_cpu_cap(X86_FEATURE_AVX512DQ); +- setup_clear_cpu_cap(X86_FEATURE_AVX512BW); +- setup_clear_cpu_cap(X86_FEATURE_AVX512VL); +- setup_clear_cpu_cap(X86_FEATURE_MPX); +- setup_clear_cpu_cap(X86_FEATURE_XGETBV1); +- setup_clear_cpu_cap(X86_FEATURE_AVX512VBMI); +- setup_clear_cpu_cap(X86_FEATURE_PKU); +- setup_clear_cpu_cap(X86_FEATURE_AVX512_4VNNIW); +- setup_clear_cpu_cap(X86_FEATURE_AVX512_4FMAPS); +- setup_clear_cpu_cap(X86_FEATURE_AVX512_VPOPCNTDQ); + } + + /* +@@ -726,6 +720,7 @@ void __init fpu__init_system_xstate(void) + unsigned int eax, ebx, ecx, edx; + static int on_boot_cpu __initdata = 1; + int err; ++ int i; + + WARN_ON_FPU(!on_boot_cpu); + on_boot_cpu = 0; +@@ -759,6 +754,14 @@ void __init fpu__init_system_xstate(void) + goto out_disable; + } + ++ /* ++ * Clear XSAVE features that are disabled in the normal CPUID. ++ */ ++ for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) { ++ if (!boot_cpu_has(xsave_cpuid_features[i])) ++ xfeatures_mask &= ~BIT(i); ++ } ++ + xfeatures_mask &= fpu__get_supported_xfeatures_mask(); + + /* Enable xstate instructions to be able to continue with initialization: */ +diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S +index f1d528bb66a6..c29020907886 100644 +--- a/arch/x86/kernel/head_32.S ++++ b/arch/x86/kernel/head_32.S +@@ -212,9 +212,6 @@ ENTRY(startup_32_smp) + #endif + + .Ldefault_entry: +-#define CR0_STATE (X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \ +- X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \ +- X86_CR0_PG) + movl $(CR0_STATE & ~X86_CR0_PG),%eax + movl %eax,%cr0 + +@@ -402,7 +399,7 @@ ENTRY(early_idt_handler_array) + # 24(%rsp) error code + i = 0 + .rept NUM_EXCEPTION_VECTORS +- .ifeq (EXCEPTION_ERRCODE_MASK >> i) & 1 ++ .if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0 + pushl $0 # Dummy error code, to make stack frame uniform + .endif + pushl $i # 20(%esp) Vector number +diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S +index 6dde3f3fc1f8..7dca675fe78d 100644 +--- a/arch/x86/kernel/head_64.S ++++ b/arch/x86/kernel/head_64.S +@@ -38,11 +38,12 @@ + * + */ + +-#define p4d_index(x) (((x) >> P4D_SHIFT) & (PTRS_PER_P4D-1)) + #define pud_index(x) (((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1)) + ++#if defined(CONFIG_XEN_PV) || defined(CONFIG_XEN_PVH) + PGD_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE) + PGD_START_KERNEL = pgd_index(__START_KERNEL_map) ++#endif + L3_START_KERNEL = pud_index(__START_KERNEL_map) + + .text +@@ -50,6 +51,7 @@ L3_START_KERNEL = pud_index(__START_KERNEL_map) + .code64 + .globl startup_64 + startup_64: ++ UNWIND_HINT_EMPTY + /* + * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 0, + * and someone has loaded an identity mapped page table +@@ -89,6 +91,7 @@ startup_64: + addq $(early_top_pgt - __START_KERNEL_map), %rax + jmp 1f + ENTRY(secondary_startup_64) ++ UNWIND_HINT_EMPTY + /* + * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 0, + * and someone has loaded a mapped page table. 
+@@ -133,6 +136,7 @@ ENTRY(secondary_startup_64) + movq $1f, %rax + jmp *%rax + 1: ++ UNWIND_HINT_EMPTY + + /* Check if nx is implemented */ + movl $0x80000001, %eax +@@ -150,9 +154,6 @@ ENTRY(secondary_startup_64) + 1: wrmsr /* Make changes effective */ + + /* Setup cr0 */ +-#define CR0_STATE (X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \ +- X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \ +- X86_CR0_PG) + movl $CR0_STATE, %eax + /* Make changes effective */ + movq %rax, %cr0 +@@ -235,7 +236,7 @@ ENTRY(secondary_startup_64) + pushq %rax # target address in negative space + lretq + .Lafter_lret: +-ENDPROC(secondary_startup_64) ++END(secondary_startup_64) + + #include "verify_cpu.S" + +@@ -247,6 +248,7 @@ ENDPROC(secondary_startup_64) + */ + ENTRY(start_cpu0) + movq initial_stack(%rip), %rsp ++ UNWIND_HINT_EMPTY + jmp .Ljump_to_C_code + ENDPROC(start_cpu0) + #endif +@@ -266,26 +268,24 @@ ENDPROC(start_cpu0) + .quad init_thread_union + THREAD_SIZE - SIZEOF_PTREGS + __FINITDATA + +-bad_address: +- jmp bad_address +- + __INIT + ENTRY(early_idt_handler_array) +- # 104(%rsp) %rflags +- # 96(%rsp) %cs +- # 88(%rsp) %rip +- # 80(%rsp) error code + i = 0 + .rept NUM_EXCEPTION_VECTORS +- .ifeq (EXCEPTION_ERRCODE_MASK >> i) & 1 +- pushq $0 # Dummy error code, to make stack frame uniform ++ .if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0 ++ UNWIND_HINT_IRET_REGS ++ pushq $0 # Dummy error code, to make stack frame uniform ++ .else ++ UNWIND_HINT_IRET_REGS offset=8 + .endif + pushq $i # 72(%rsp) Vector number + jmp early_idt_handler_common ++ UNWIND_HINT_IRET_REGS + i = i + 1 + .fill early_idt_handler_array + i*EARLY_IDT_HANDLER_SIZE - ., 1, 0xcc + .endr +-ENDPROC(early_idt_handler_array) ++ UNWIND_HINT_IRET_REGS offset=16 ++END(early_idt_handler_array) + + early_idt_handler_common: + /* +@@ -313,6 +313,7 @@ early_idt_handler_common: + pushq %r13 /* pt_regs->r13 */ + pushq %r14 /* pt_regs->r14 */ + pushq %r15 /* pt_regs->r15 */ ++ UNWIND_HINT_REGS + + cmpq $14,%rsi /* Page fault? */ + jnz 10f +@@ -327,8 +328,8 @@ early_idt_handler_common: + + 20: + decl early_recursion_flag(%rip) +- jmp restore_regs_and_iret +-ENDPROC(early_idt_handler_common) ++ jmp restore_regs_and_return_to_kernel ++END(early_idt_handler_common) + + __INITDATA + +@@ -362,10 +363,7 @@ NEXT_PAGE(early_dynamic_pgts) + + .data + +-#ifndef CONFIG_XEN +-NEXT_PAGE(init_top_pgt) +- .fill 512,8,0 +-#else ++#if defined(CONFIG_XEN_PV) || defined(CONFIG_XEN_PVH) + NEXT_PAGE(init_top_pgt) + .quad level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC + .org init_top_pgt + PGD_PAGE_OFFSET*8, 0 +@@ -382,6 +380,9 @@ NEXT_PAGE(level2_ident_pgt) + * Don't set NX because code runs from these pages. 
+ */ + PMDS(0, __PAGE_KERNEL_IDENT_LARGE_EXEC, PTRS_PER_PMD) ++#else ++NEXT_PAGE(init_top_pgt) ++ .fill 512,8,0 + #endif + + #ifdef CONFIG_X86_5LEVEL +@@ -435,7 +436,7 @@ ENTRY(phys_base) + EXPORT_SYMBOL(phys_base) + + #include "../../x86/xen/xen-head.S" +- ++ + __PAGE_ALIGNED_BSS + NEXT_PAGE(empty_zero_page) + .skip PAGE_SIZE +diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c +index 3feb648781c4..2f723301eb58 100644 +--- a/arch/x86/kernel/ioport.c ++++ b/arch/x86/kernel/ioport.c +@@ -67,7 +67,7 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on) + * because the ->io_bitmap_max value must match the bitmap + * contents: + */ +- tss = &per_cpu(cpu_tss, get_cpu()); ++ tss = &per_cpu(cpu_tss_rw, get_cpu()); + + if (turn_on) + bitmap_clear(t->io_bitmap_ptr, from, num); +diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c +index 52089c043160..aa9d51eea9d0 100644 +--- a/arch/x86/kernel/irq.c ++++ b/arch/x86/kernel/irq.c +@@ -219,18 +219,6 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs) + /* high bit used in ret_from_ code */ + unsigned vector = ~regs->orig_ax; + +- /* +- * NB: Unlike exception entries, IRQ entries do not reliably +- * handle context tracking in the low-level entry code. This is +- * because syscall entries execute briefly with IRQs on before +- * updating context tracking state, so we can take an IRQ from +- * kernel mode with CONTEXT_USER. The low-level entry code only +- * updates the context if we came from user mode, so we won't +- * switch to CONTEXT_KERNEL. We'll fix that once the syscall +- * code is cleaned up enough that we can cleanly defer enabling +- * IRQs. +- */ +- + entering_irq(); + + /* entering_irq() tells RCU that we're not quiescent. Check it. */ +diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c +index 020efbf5786b..d86e344f5b3d 100644 +--- a/arch/x86/kernel/irq_64.c ++++ b/arch/x86/kernel/irq_64.c +@@ -57,10 +57,10 @@ static inline void stack_overflow_check(struct pt_regs *regs) + if (regs->sp >= estack_top && regs->sp <= estack_bottom) + return; + +- WARN_ONCE(1, "do_IRQ(): %s has overflown the kernel stack (cur:%Lx,sp:%lx,irq stk top-bottom:%Lx-%Lx,exception stk top-bottom:%Lx-%Lx)\n", ++ WARN_ONCE(1, "do_IRQ(): %s has overflown the kernel stack (cur:%Lx,sp:%lx,irq stk top-bottom:%Lx-%Lx,exception stk top-bottom:%Lx-%Lx,ip:%pF)\n", + current->comm, curbase, regs->sp, + irq_stack_top, irq_stack_bottom, +- estack_top, estack_bottom); ++ estack_top, estack_bottom, (void *)regs->ip); + + if (sysctl_panic_on_stackoverflow) + panic("low stack detected by irq handler - check messages\n"); +diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c +index 8bb9594d0761..a94de09edbed 100644 +--- a/arch/x86/kernel/kvm.c ++++ b/arch/x86/kernel/kvm.c +@@ -544,12 +544,12 @@ static uint32_t __init kvm_detect(void) + return kvm_cpuid_base(); + } + +-const struct hypervisor_x86 x86_hyper_kvm __refconst = { ++const __initconst struct hypervisor_x86 x86_hyper_kvm = { + .name = "KVM", + .detect = kvm_detect, +- .x2apic_available = kvm_para_available, ++ .type = X86_HYPER_KVM, ++ .init.x2apic_available = kvm_para_available, + }; +-EXPORT_SYMBOL_GPL(x86_hyper_kvm); + + static __init int activate_jump_labels(void) + { +diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c +index ae5615b03def..1c1eae961340 100644 +--- a/arch/x86/kernel/ldt.c ++++ b/arch/x86/kernel/ldt.c +@@ -103,7 +103,7 @@ static void finalize_ldt_struct(struct ldt_struct *ldt) + static void install_ldt(struct mm_struct 
*current_mm, + struct ldt_struct *ldt) + { +- /* Synchronizes with lockless_dereference in load_mm_ldt. */ ++ /* Synchronizes with READ_ONCE in load_mm_ldt. */ + smp_store_release(¤t_mm->context.ldt, ldt); + + /* Activate the LDT for all CPUs using current_mm. */ +diff --git a/arch/x86/kernel/paravirt_patch_64.c b/arch/x86/kernel/paravirt_patch_64.c +index ac0be8283325..9edadabf04f6 100644 +--- a/arch/x86/kernel/paravirt_patch_64.c ++++ b/arch/x86/kernel/paravirt_patch_64.c +@@ -10,7 +10,6 @@ DEF_NATIVE(pv_irq_ops, save_fl, "pushfq; popq %rax"); + DEF_NATIVE(pv_mmu_ops, read_cr2, "movq %cr2, %rax"); + DEF_NATIVE(pv_mmu_ops, read_cr3, "movq %cr3, %rax"); + DEF_NATIVE(pv_mmu_ops, write_cr3, "movq %rdi, %cr3"); +-DEF_NATIVE(pv_mmu_ops, flush_tlb_single, "invlpg (%rdi)"); + DEF_NATIVE(pv_cpu_ops, wbinvd, "wbinvd"); + + DEF_NATIVE(pv_cpu_ops, usergs_sysret64, "swapgs; sysretq"); +@@ -60,7 +59,6 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf, + PATCH_SITE(pv_mmu_ops, read_cr2); + PATCH_SITE(pv_mmu_ops, read_cr3); + PATCH_SITE(pv_mmu_ops, write_cr3); +- PATCH_SITE(pv_mmu_ops, flush_tlb_single); + PATCH_SITE(pv_cpu_ops, wbinvd); + #if defined(CONFIG_PARAVIRT_SPINLOCKS) + case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock): +diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c +index c67685337c5a..517415978409 100644 +--- a/arch/x86/kernel/process.c ++++ b/arch/x86/kernel/process.c +@@ -47,9 +47,25 @@ + * section. Since TSS's are completely CPU-local, we want them + * on exact cacheline boundaries, to eliminate cacheline ping-pong. + */ +-__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = { ++__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss_rw) = { + .x86_tss = { +- .sp0 = TOP_OF_INIT_STACK, ++ /* ++ * .sp0 is only used when entering ring 0 from a lower ++ * privilege level. Since the init task never runs anything ++ * but ring 0 code, there is no need for a valid value here. ++ * Poison it. ++ */ ++ .sp0 = (1UL << (BITS_PER_LONG-1)) + 1, ++ ++#ifdef CONFIG_X86_64 ++ /* ++ * .sp1 is cpu_current_top_of_stack. The init task never ++ * runs user code, but cpu_current_top_of_stack should still ++ * be well defined before the first context switch. ++ */ ++ .sp1 = TOP_OF_INIT_STACK, ++#endif ++ + #ifdef CONFIG_X86_32 + .ss0 = __KERNEL_DS, + .ss1 = __KERNEL_CS, +@@ -65,11 +81,8 @@ __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = { + */ + .io_bitmap = { [0 ... 
IO_BITMAP_LONGS] = ~0 }, + #endif +-#ifdef CONFIG_X86_32 +- .SYSENTER_stack_canary = STACK_END_MAGIC, +-#endif + }; +-EXPORT_PER_CPU_SYMBOL(cpu_tss); ++EXPORT_PER_CPU_SYMBOL(cpu_tss_rw); + + DEFINE_PER_CPU(bool, __tss_limit_invalid); + EXPORT_PER_CPU_SYMBOL_GPL(__tss_limit_invalid); +@@ -98,7 +111,7 @@ void exit_thread(struct task_struct *tsk) + struct fpu *fpu = &t->fpu; + + if (bp) { +- struct tss_struct *tss = &per_cpu(cpu_tss, get_cpu()); ++ struct tss_struct *tss = &per_cpu(cpu_tss_rw, get_cpu()); + + t->io_bitmap_ptr = NULL; + clear_thread_flag(TIF_IO_BITMAP); +diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c +index 11966251cd42..5224c6099184 100644 +--- a/arch/x86/kernel/process_32.c ++++ b/arch/x86/kernel/process_32.c +@@ -234,7 +234,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) + struct fpu *prev_fpu = &prev->fpu; + struct fpu *next_fpu = &next->fpu; + int cpu = smp_processor_id(); +- struct tss_struct *tss = &per_cpu(cpu_tss, cpu); ++ struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu); + + /* never put a printk in __switch_to... printk() calls wake_up*() indirectly */ + +@@ -284,9 +284,11 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) + + /* + * Reload esp0 and cpu_current_top_of_stack. This changes +- * current_thread_info(). ++ * current_thread_info(). Refresh the SYSENTER configuration in ++ * case prev or next is vm86. + */ +- load_sp0(tss, next); ++ update_sp0(next_p); ++ refresh_sysenter_cs(next); + this_cpu_write(cpu_current_top_of_stack, + (unsigned long)task_stack_page(next_p) + + THREAD_SIZE); +diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c +index 302e7b2572d1..c75466232016 100644 +--- a/arch/x86/kernel/process_64.c ++++ b/arch/x86/kernel/process_64.c +@@ -69,9 +69,8 @@ void __show_regs(struct pt_regs *regs, int all) + unsigned int fsindex, gsindex; + unsigned int ds, cs, es; + +- printk(KERN_DEFAULT "RIP: %04lx:%pS\n", regs->cs, (void *)regs->ip); +- printk(KERN_DEFAULT "RSP: %04lx:%016lx EFLAGS: %08lx", regs->ss, +- regs->sp, regs->flags); ++ show_iret_regs(regs); ++ + if (regs->orig_ax != -1) + pr_cont(" ORIG_RAX: %016lx\n", regs->orig_ax); + else +@@ -88,6 +87,9 @@ void __show_regs(struct pt_regs *regs, int all) + printk(KERN_DEFAULT "R13: %016lx R14: %016lx R15: %016lx\n", + regs->r13, regs->r14, regs->r15); + ++ if (!all) ++ return; ++ + asm("movl %%ds,%0" : "=r" (ds)); + asm("movl %%cs,%0" : "=r" (cs)); + asm("movl %%es,%0" : "=r" (es)); +@@ -98,9 +100,6 @@ void __show_regs(struct pt_regs *regs, int all) + rdmsrl(MSR_GS_BASE, gs); + rdmsrl(MSR_KERNEL_GS_BASE, shadowgs); + +- if (!all) +- return; +- + cr0 = read_cr0(); + cr2 = read_cr2(); + cr3 = __read_cr3(); +@@ -274,7 +273,6 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp, + struct inactive_task_frame *frame; + struct task_struct *me = current; + +- p->thread.sp0 = (unsigned long)task_stack_page(p) + THREAD_SIZE; + childregs = task_pt_regs(p); + fork_frame = container_of(childregs, struct fork_frame, regs); + frame = &fork_frame->frame; +@@ -401,7 +399,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) + struct fpu *prev_fpu = &prev->fpu; + struct fpu *next_fpu = &next->fpu; + int cpu = smp_processor_id(); +- struct tss_struct *tss = &per_cpu(cpu_tss, cpu); ++ struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu); + + WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ENTRY) && + this_cpu_read(irq_count) != -1); +@@ -463,9 +461,10 @@ __switch_to(struct task_struct *prev_p, struct 
task_struct *next_p) + * Switch the PDA and FPU contexts. + */ + this_cpu_write(current_task, next_p); ++ this_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p)); + +- /* Reload esp0 and ss1. This changes current_thread_info(). */ +- load_sp0(tss, next); ++ /* Reload sp0. */ ++ update_sp0(next_p); + + /* + * Now maybe reload the debug registers and handle I/O bitmaps +diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c +index 5e0453f18a57..142126ab5aae 100644 +--- a/arch/x86/kernel/smpboot.c ++++ b/arch/x86/kernel/smpboot.c +@@ -962,8 +962,7 @@ void common_cpu_up(unsigned int cpu, struct task_struct *idle) + #ifdef CONFIG_X86_32 + /* Stack for startup_32 can be just as for start_secondary onwards */ + irq_ctx_init(cpu); +- per_cpu(cpu_current_top_of_stack, cpu) = +- (unsigned long)task_stack_page(idle) + THREAD_SIZE; ++ per_cpu(cpu_current_top_of_stack, cpu) = task_top_of_stack(idle); + #else + initial_gs = per_cpu_offset(cpu); + #endif +diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c +index 5a6b8f809792..74136fd16f49 100644 +--- a/arch/x86/kernel/traps.c ++++ b/arch/x86/kernel/traps.c +@@ -141,8 +141,7 @@ void ist_begin_non_atomic(struct pt_regs *regs) + * will catch asm bugs and any attempt to use ist_preempt_enable + * from double_fault. + */ +- BUG_ON((unsigned long)(current_top_of_stack() - +- current_stack_pointer) >= THREAD_SIZE); ++ BUG_ON(!on_thread_stack()); + + preempt_enable_no_resched(); + } +@@ -349,9 +348,15 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) + + /* + * If IRET takes a non-IST fault on the espfix64 stack, then we +- * end up promoting it to a doublefault. In that case, modify +- * the stack to make it look like we just entered the #GP +- * handler from user space, similar to bad_iret. ++ * end up promoting it to a doublefault. In that case, take ++ * advantage of the fact that we're not using the normal (TSS.sp0) ++ * stack right now. We can write a fake #GP(0) frame at TSS.sp0 ++ * and then modify our own IRET frame so that, when we return, ++ * we land directly at the #GP(0) vector with the stack already ++ * set up according to its expectations. ++ * ++ * The net result is that our #GP handler will think that we ++ * entered from usermode with the bad user context. + * + * No need for ist_enter here because we don't use RCU. + */ +@@ -359,13 +364,26 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) + regs->cs == __KERNEL_CS && + regs->ip == (unsigned long)native_irq_return_iret) + { +- struct pt_regs *normal_regs = task_pt_regs(current); ++ struct pt_regs *gpregs = (struct pt_regs *)this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1; + +- /* Fake a #GP(0) from userspace. */ +- memmove(&normal_regs->ip, (void *)regs->sp, 5*8); +- normal_regs->orig_ax = 0; /* Missing (lost) #GP error code */ ++ /* ++ * regs->sp points to the failing IRET frame on the ++ * ESPFIX64 stack. Copy it to the entry stack. This fills ++ * in gpregs->ss through gpregs->ip. ++ * ++ */ ++ memmove(&gpregs->ip, (void *)regs->sp, 5*8); ++ gpregs->orig_ax = 0; /* Missing (lost) #GP error code */ ++ ++ /* ++ * Adjust our frame so that we return straight to the #GP ++ * vector with the expected RSP value. This is safe because ++ * we won't enable interupts or schedule before we invoke ++ * general_protection, so nothing will clobber the stack ++ * frame we just set up. 
++ */ + regs->ip = (unsigned long)general_protection; +- regs->sp = (unsigned long)&normal_regs->orig_ax; ++ regs->sp = (unsigned long)&gpregs->orig_ax; + + return; + } +@@ -390,7 +408,7 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) + * + * Processors update CR2 whenever a page fault is detected. If a + * second page fault occurs while an earlier page fault is being +- * deliv- ered, the faulting linear address of the second fault will ++ * delivered, the faulting linear address of the second fault will + * overwrite the contents of CR2 (replacing the previous + * address). These updates to CR2 occur even if the page fault + * results in a double fault or occurs during the delivery of a +@@ -601,14 +619,15 @@ NOKPROBE_SYMBOL(do_int3); + + #ifdef CONFIG_X86_64 + /* +- * Help handler running on IST stack to switch off the IST stack if the +- * interrupted code was in user mode. The actual stack switch is done in +- * entry_64.S ++ * Help handler running on a per-cpu (IST or entry trampoline) stack ++ * to switch to the normal thread stack if the interrupted code was in ++ * user mode. The actual stack switch is done in entry_64.S + */ + asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs) + { +- struct pt_regs *regs = task_pt_regs(current); +- *regs = *eregs; ++ struct pt_regs *regs = (struct pt_regs *)this_cpu_read(cpu_current_top_of_stack) - 1; ++ if (regs != eregs) ++ *regs = *eregs; + return regs; + } + NOKPROBE_SYMBOL(sync_regs); +@@ -624,13 +643,13 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s) + /* + * This is called from entry_64.S early in handling a fault + * caused by a bad iret to user mode. To handle the fault +- * correctly, we want move our stack frame to task_pt_regs +- * and we want to pretend that the exception came from the +- * iret target. ++ * correctly, we want to move our stack frame to where it would ++ * be had we entered directly on the entry stack (rather than ++ * just below the IRET frame) and we want to pretend that the ++ * exception came from the IRET target. + */ + struct bad_iret_stack *new_stack = +- container_of(task_pt_regs(current), +- struct bad_iret_stack, regs); ++ (struct bad_iret_stack *)this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1; + + /* Copy the IRET target to the new stack. */ + memmove(&new_stack->regs.ip, (void *)s->regs.sp, 5*8); +@@ -795,14 +814,6 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code) + debug_stack_usage_dec(); + + exit: +-#if defined(CONFIG_X86_32) +- /* +- * This is the most likely code path that involves non-trivial use +- * of the SYSENTER stack. Check that we haven't overrun it. 
+- */ +- WARN(this_cpu_read(cpu_tss.SYSENTER_stack_canary) != STACK_END_MAGIC, +- "Overran or corrupted SYSENTER stack\n"); +-#endif + ist_exit(regs); + } + NOKPROBE_SYMBOL(do_debug); +@@ -929,6 +940,9 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code) + + void __init trap_init(void) + { ++ /* Init cpu_entry_area before IST entries are set up */ ++ setup_cpu_entry_areas(); ++ + idt_setup_traps(); + + /* +diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c +index a3f973b2c97a..be86a865087a 100644 +--- a/arch/x86/kernel/unwind_orc.c ++++ b/arch/x86/kernel/unwind_orc.c +@@ -253,22 +253,15 @@ unsigned long *unwind_get_return_address_ptr(struct unwind_state *state) + return NULL; + } + +-static bool stack_access_ok(struct unwind_state *state, unsigned long addr, ++static bool stack_access_ok(struct unwind_state *state, unsigned long _addr, + size_t len) + { + struct stack_info *info = &state->stack_info; ++ void *addr = (void *)_addr; + +- /* +- * If the address isn't on the current stack, switch to the next one. +- * +- * We may have to traverse multiple stacks to deal with the possibility +- * that info->next_sp could point to an empty stack and the address +- * could be on a subsequent stack. +- */ +- while (!on_stack(info, (void *)addr, len)) +- if (get_stack_info(info->next_sp, state->task, info, +- &state->stack_mask)) +- return false; ++ if (!on_stack(info, addr, len) && ++ (get_stack_info(addr, state->task, info, &state->stack_mask))) ++ return false; + + return true; + } +@@ -283,42 +276,32 @@ static bool deref_stack_reg(struct unwind_state *state, unsigned long addr, + return true; + } + +-#define REGS_SIZE (sizeof(struct pt_regs)) +-#define SP_OFFSET (offsetof(struct pt_regs, sp)) +-#define IRET_REGS_SIZE (REGS_SIZE - offsetof(struct pt_regs, ip)) +-#define IRET_SP_OFFSET (SP_OFFSET - offsetof(struct pt_regs, ip)) +- + static bool deref_stack_regs(struct unwind_state *state, unsigned long addr, +- unsigned long *ip, unsigned long *sp, bool full) ++ unsigned long *ip, unsigned long *sp) + { +- size_t regs_size = full ? REGS_SIZE : IRET_REGS_SIZE; +- size_t sp_offset = full ? 
SP_OFFSET : IRET_SP_OFFSET; +- struct pt_regs *regs = (struct pt_regs *)(addr + regs_size - REGS_SIZE); +- +- if (IS_ENABLED(CONFIG_X86_64)) { +- if (!stack_access_ok(state, addr, regs_size)) +- return false; ++ struct pt_regs *regs = (struct pt_regs *)addr; + +- *ip = regs->ip; +- *sp = regs->sp; ++ /* x86-32 support will be more complicated due to the ®s->sp hack */ ++ BUILD_BUG_ON(IS_ENABLED(CONFIG_X86_32)); + +- return true; +- } +- +- if (!stack_access_ok(state, addr, sp_offset)) ++ if (!stack_access_ok(state, addr, sizeof(struct pt_regs))) + return false; + + *ip = regs->ip; ++ *sp = regs->sp; ++ return true; ++} + +- if (user_mode(regs)) { +- if (!stack_access_ok(state, addr + sp_offset, +- REGS_SIZE - SP_OFFSET)) +- return false; ++static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr, ++ unsigned long *ip, unsigned long *sp) ++{ ++ struct pt_regs *regs = (void *)addr - IRET_FRAME_OFFSET; + +- *sp = regs->sp; +- } else +- *sp = (unsigned long)®s->sp; ++ if (!stack_access_ok(state, addr, IRET_FRAME_SIZE)) ++ return false; + ++ *ip = regs->ip; ++ *sp = regs->sp; + return true; + } + +@@ -327,7 +310,6 @@ bool unwind_next_frame(struct unwind_state *state) + unsigned long ip_p, sp, orig_ip, prev_sp = state->sp; + enum stack_type prev_type = state->stack_info.type; + struct orc_entry *orc; +- struct pt_regs *ptregs; + bool indirect = false; + + if (unwind_done(state)) +@@ -435,7 +417,7 @@ bool unwind_next_frame(struct unwind_state *state) + break; + + case ORC_TYPE_REGS: +- if (!deref_stack_regs(state, sp, &state->ip, &state->sp, true)) { ++ if (!deref_stack_regs(state, sp, &state->ip, &state->sp)) { + orc_warn("can't dereference registers at %p for ip %pB\n", + (void *)sp, (void *)orig_ip); + goto done; +@@ -447,20 +429,14 @@ bool unwind_next_frame(struct unwind_state *state) + break; + + case ORC_TYPE_REGS_IRET: +- if (!deref_stack_regs(state, sp, &state->ip, &state->sp, false)) { ++ if (!deref_stack_iret_regs(state, sp, &state->ip, &state->sp)) { + orc_warn("can't dereference iret registers at %p for ip %pB\n", + (void *)sp, (void *)orig_ip); + goto done; + } + +- ptregs = container_of((void *)sp, struct pt_regs, ip); +- if ((unsigned long)ptregs >= prev_sp && +- on_stack(&state->stack_info, ptregs, REGS_SIZE)) { +- state->regs = ptregs; +- state->full_regs = false; +- } else +- state->regs = NULL; +- ++ state->regs = (void *)sp - IRET_FRAME_OFFSET; ++ state->full_regs = false; + state->signal = true; + break; + +@@ -553,8 +529,18 @@ void __unwind_start(struct unwind_state *state, struct task_struct *task, + } + + if (get_stack_info((unsigned long *)state->sp, state->task, +- &state->stack_info, &state->stack_mask)) +- return; ++ &state->stack_info, &state->stack_mask)) { ++ /* ++ * We weren't on a valid stack. It's possible that ++ * we overflowed a valid stack into a guard page. ++ * See if the next page up is valid so that we can ++ * generate some kind of backtrace if this happens. 
++ */ ++ void *next_page = (void *)PAGE_ALIGN((unsigned long)state->sp); ++ if (get_stack_info(next_page, state->task, &state->stack_info, ++ &state->stack_mask)) ++ return; ++ } + + /* + * The caller can provide the address of the first frame directly +diff --git a/arch/x86/kernel/verify_cpu.S b/arch/x86/kernel/verify_cpu.S +index 014ea59aa153..3d3c2f71f617 100644 +--- a/arch/x86/kernel/verify_cpu.S ++++ b/arch/x86/kernel/verify_cpu.S +@@ -33,7 +33,7 @@ + #include + #include + +-verify_cpu: ++ENTRY(verify_cpu) + pushf # Save caller passed flags + push $0 # Kill any dangerous flags + popf +@@ -139,3 +139,4 @@ verify_cpu: + popf # Restore caller passed flags + xorl %eax, %eax + ret ++ENDPROC(verify_cpu) +diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c +index 68244742ecb0..5edb27f1a2c4 100644 +--- a/arch/x86/kernel/vm86_32.c ++++ b/arch/x86/kernel/vm86_32.c +@@ -55,6 +55,7 @@ + #include + #include + #include ++#include + + /* + * Known problems: +@@ -94,7 +95,6 @@ + + void save_v86_state(struct kernel_vm86_regs *regs, int retval) + { +- struct tss_struct *tss; + struct task_struct *tsk = current; + struct vm86plus_struct __user *user; + struct vm86 *vm86 = current->thread.vm86; +@@ -146,12 +146,13 @@ void save_v86_state(struct kernel_vm86_regs *regs, int retval) + do_exit(SIGSEGV); + } + +- tss = &per_cpu(cpu_tss, get_cpu()); ++ preempt_disable(); + tsk->thread.sp0 = vm86->saved_sp0; + tsk->thread.sysenter_cs = __KERNEL_CS; +- load_sp0(tss, &tsk->thread); ++ update_sp0(tsk); ++ refresh_sysenter_cs(&tsk->thread); + vm86->saved_sp0 = 0; +- put_cpu(); ++ preempt_enable(); + + memcpy(®s->pt, &vm86->regs32, sizeof(struct pt_regs)); + +@@ -237,7 +238,6 @@ SYSCALL_DEFINE2(vm86, unsigned long, cmd, unsigned long, arg) + + static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus) + { +- struct tss_struct *tss; + struct task_struct *tsk = current; + struct vm86 *vm86 = tsk->thread.vm86; + struct kernel_vm86_regs vm86regs; +@@ -365,15 +365,17 @@ static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus) + vm86->saved_sp0 = tsk->thread.sp0; + lazy_save_gs(vm86->regs32.gs); + +- tss = &per_cpu(cpu_tss, get_cpu()); + /* make room for real-mode segments */ ++ preempt_disable(); + tsk->thread.sp0 += 16; + +- if (static_cpu_has(X86_FEATURE_SEP)) ++ if (static_cpu_has(X86_FEATURE_SEP)) { + tsk->thread.sysenter_cs = 0; ++ refresh_sysenter_cs(&tsk->thread); ++ } + +- load_sp0(tss, &tsk->thread); +- put_cpu(); ++ update_sp0(tsk); ++ preempt_enable(); + + if (vm86->flags & VM86_SCREEN_BITMAP) + mark_screen_rdonly(tsk->mm); +diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S +index a4009fb9be87..d2a8b5a24a44 100644 +--- a/arch/x86/kernel/vmlinux.lds.S ++++ b/arch/x86/kernel/vmlinux.lds.S +@@ -107,6 +107,15 @@ SECTIONS + SOFTIRQENTRY_TEXT + *(.fixup) + *(.gnu.warning) ++ ++#ifdef CONFIG_X86_64 ++ . = ALIGN(PAGE_SIZE); ++ _entry_trampoline = .; ++ *(.entry_trampoline) ++ . = ALIGN(PAGE_SIZE); ++ ASSERT(. 
- _entry_trampoline == PAGE_SIZE, "entry trampoline is too big"); ++#endif ++ + /* End of text section */ + _etext = .; + } :text = 0x9090 +diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c +index a088b2c47f73..5b2d10c1973a 100644 +--- a/arch/x86/kernel/x86_init.c ++++ b/arch/x86/kernel/x86_init.c +@@ -28,6 +28,8 @@ void x86_init_noop(void) { } + void __init x86_init_uint_noop(unsigned int unused) { } + int __init iommu_init_noop(void) { return 0; } + void iommu_shutdown_noop(void) { } ++bool __init bool_x86_init_noop(void) { return false; } ++void x86_op_int_noop(int cpu) { } + + /* + * The platform setup functions are preset with the default functions +@@ -81,6 +83,12 @@ struct x86_init_ops x86_init __initdata = { + .init_irq = x86_default_pci_init_irq, + .fixup_irqs = x86_default_pci_fixup_irqs, + }, ++ ++ .hyper = { ++ .init_platform = x86_init_noop, ++ .x2apic_available = bool_x86_init_noop, ++ .init_mem_mapping = x86_init_noop, ++ }, + }; + + struct x86_cpuinit_ops x86_cpuinit = { +@@ -101,6 +109,7 @@ struct x86_platform_ops x86_platform __ro_after_init = { + .get_nmi_reason = default_get_nmi_reason, + .save_sched_clock_state = tsc_save_sched_clock_state, + .restore_sched_clock_state = tsc_restore_sched_clock_state, ++ .hyper.pin_vcpu = x86_op_int_noop, + }; + + EXPORT_SYMBOL_GPL(x86_platform); +diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c +index 7a69cf053711..13ebeedcec07 100644 +--- a/arch/x86/kvm/mmu.c ++++ b/arch/x86/kvm/mmu.c +@@ -5476,13 +5476,13 @@ int kvm_mmu_module_init(void) + + pte_list_desc_cache = kmem_cache_create("pte_list_desc", + sizeof(struct pte_list_desc), +- 0, 0, NULL); ++ 0, SLAB_ACCOUNT, NULL); + if (!pte_list_desc_cache) + goto nomem; + + mmu_page_header_cache = kmem_cache_create("kvm_mmu_page_header", + sizeof(struct kvm_mmu_page), +- 0, 0, NULL); ++ 0, SLAB_ACCOUNT, NULL); + if (!mmu_page_header_cache) + goto nomem; + +diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c +index bc5921c1e2f2..47d9432756f3 100644 +--- a/arch/x86/kvm/vmx.c ++++ b/arch/x86/kvm/vmx.c +@@ -2295,7 +2295,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) + * processors. See 22.2.4. + */ + vmcs_writel(HOST_TR_BASE, +- (unsigned long)this_cpu_ptr(&cpu_tss)); ++ (unsigned long)&get_cpu_entry_area(cpu)->tss.x86_tss); + vmcs_writel(HOST_GDTR_BASE, (unsigned long)gdt); /* 22.2.4 */ + + /* +diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c +index 553f8fd23cc4..4846eff7e4c8 100644 +--- a/arch/x86/lib/delay.c ++++ b/arch/x86/lib/delay.c +@@ -107,10 +107,10 @@ static void delay_mwaitx(unsigned long __loops) + delay = min_t(u64, MWAITX_MAX_LOOPS, loops); + + /* +- * Use cpu_tss as a cacheline-aligned, seldomly ++ * Use cpu_tss_rw as a cacheline-aligned, seldomly + * accessed per-cpu variable as the monitor target. 
+ */ +- __monitorx(raw_cpu_ptr(&cpu_tss), 0, 0); ++ __monitorx(raw_cpu_ptr(&cpu_tss_rw), 0, 0); + + /* + * AMD, like Intel, supports the EAX hint and EAX=0xf +diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c +index b0ff378650a9..3109ba6c6ede 100644 +--- a/arch/x86/mm/fault.c ++++ b/arch/x86/mm/fault.c +@@ -29,26 +29,6 @@ + #define CREATE_TRACE_POINTS + #include + +-/* +- * Page fault error code bits: +- * +- * bit 0 == 0: no page found 1: protection fault +- * bit 1 == 0: read access 1: write access +- * bit 2 == 0: kernel-mode access 1: user-mode access +- * bit 3 == 1: use of reserved bit detected +- * bit 4 == 1: fault was an instruction fetch +- * bit 5 == 1: protection keys block access +- */ +-enum x86_pf_error_code { +- +- PF_PROT = 1 << 0, +- PF_WRITE = 1 << 1, +- PF_USER = 1 << 2, +- PF_RSVD = 1 << 3, +- PF_INSTR = 1 << 4, +- PF_PK = 1 << 5, +-}; +- + /* + * Returns 0 if mmiotrace is disabled, or if the fault is not + * handled by mmiotrace: +@@ -150,7 +130,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr) + * If it was a exec (instruction fetch) fault on NX page, then + * do not ignore the fault: + */ +- if (error_code & PF_INSTR) ++ if (error_code & X86_PF_INSTR) + return 0; + + instr = (void *)convert_ip_to_linear(current, regs); +@@ -180,7 +160,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr) + * siginfo so userspace can discover which protection key was set + * on the PTE. + * +- * If we get here, we know that the hardware signaled a PF_PK ++ * If we get here, we know that the hardware signaled a X86_PF_PK + * fault and that there was a VMA once we got in the fault + * handler. It does *not* guarantee that the VMA we find here + * was the one that we faulted on. +@@ -205,7 +185,7 @@ static void fill_sig_info_pkey(int si_code, siginfo_t *info, u32 *pkey) + /* + * force_sig_info_fault() is called from a number of + * contexts, some of which have a VMA and some of which +- * do not. The PF_PK handing happens after we have a ++ * do not. The X86_PF_PK handing happens after we have a + * valid VMA, so we should never reach this without a + * valid VMA. + */ +@@ -698,7 +678,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, + if (!oops_may_print()) + return; + +- if (error_code & PF_INSTR) { ++ if (error_code & X86_PF_INSTR) { + unsigned int level; + pgd_t *pgd; + pte_t *pte; +@@ -780,7 +760,7 @@ no_context(struct pt_regs *regs, unsigned long error_code, + */ + if (current->thread.sig_on_uaccess_err && signal) { + tsk->thread.trap_nr = X86_TRAP_PF; +- tsk->thread.error_code = error_code | PF_USER; ++ tsk->thread.error_code = error_code | X86_PF_USER; + tsk->thread.cr2 = address; + + /* XXX: hwpoison faults will set the wrong code. */ +@@ -898,7 +878,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code, + struct task_struct *tsk = current; + + /* User mode accesses just cause a SIGSEGV */ +- if (error_code & PF_USER) { ++ if (error_code & X86_PF_USER) { + /* + * It's possible to have interrupts off here: + */ +@@ -919,7 +899,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code, + * Instruction fetch faults in the vsyscall page might need + * emulation. 
+ */ +- if (unlikely((error_code & PF_INSTR) && ++ if (unlikely((error_code & X86_PF_INSTR) && + ((address & ~0xfff) == VSYSCALL_ADDR))) { + if (emulate_vsyscall(regs, address)) + return; +@@ -932,7 +912,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code, + * are always protection faults. + */ + if (address >= TASK_SIZE_MAX) +- error_code |= PF_PROT; ++ error_code |= X86_PF_PROT; + + if (likely(show_unhandled_signals)) + show_signal_msg(regs, error_code, address, tsk); +@@ -993,11 +973,11 @@ static inline bool bad_area_access_from_pkeys(unsigned long error_code, + + if (!boot_cpu_has(X86_FEATURE_OSPKE)) + return false; +- if (error_code & PF_PK) ++ if (error_code & X86_PF_PK) + return true; + /* this checks permission keys on the VMA: */ +- if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE), +- (error_code & PF_INSTR), foreign)) ++ if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE), ++ (error_code & X86_PF_INSTR), foreign)) + return true; + return false; + } +@@ -1025,7 +1005,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address, + int code = BUS_ADRERR; + + /* Kernel mode? Handle exceptions or die: */ +- if (!(error_code & PF_USER)) { ++ if (!(error_code & X86_PF_USER)) { + no_context(regs, error_code, address, SIGBUS, BUS_ADRERR); + return; + } +@@ -1053,14 +1033,14 @@ static noinline void + mm_fault_error(struct pt_regs *regs, unsigned long error_code, + unsigned long address, u32 *pkey, unsigned int fault) + { +- if (fatal_signal_pending(current) && !(error_code & PF_USER)) { ++ if (fatal_signal_pending(current) && !(error_code & X86_PF_USER)) { + no_context(regs, error_code, address, 0, 0); + return; + } + + if (fault & VM_FAULT_OOM) { + /* Kernel mode? Handle exceptions or die: */ +- if (!(error_code & PF_USER)) { ++ if (!(error_code & X86_PF_USER)) { + no_context(regs, error_code, address, + SIGSEGV, SEGV_MAPERR); + return; +@@ -1085,16 +1065,16 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code, + + static int spurious_fault_check(unsigned long error_code, pte_t *pte) + { +- if ((error_code & PF_WRITE) && !pte_write(*pte)) ++ if ((error_code & X86_PF_WRITE) && !pte_write(*pte)) + return 0; + +- if ((error_code & PF_INSTR) && !pte_exec(*pte)) ++ if ((error_code & X86_PF_INSTR) && !pte_exec(*pte)) + return 0; + /* + * Note: We do not do lazy flushing on protection key +- * changes, so no spurious fault will ever set PF_PK. ++ * changes, so no spurious fault will ever set X86_PF_PK. + */ +- if ((error_code & PF_PK)) ++ if ((error_code & X86_PF_PK)) + return 1; + + return 1; +@@ -1140,8 +1120,8 @@ spurious_fault(unsigned long error_code, unsigned long address) + * change, so user accesses are not expected to cause spurious + * faults. + */ +- if (error_code != (PF_WRITE | PF_PROT) +- && error_code != (PF_INSTR | PF_PROT)) ++ if (error_code != (X86_PF_WRITE | X86_PF_PROT) && ++ error_code != (X86_PF_INSTR | X86_PF_PROT)) + return 0; + + pgd = init_mm.pgd + pgd_index(address); +@@ -1201,19 +1181,19 @@ access_error(unsigned long error_code, struct vm_area_struct *vma) + * always an unconditional error and can never result in + * a follow-up action to resolve the fault, like a COW. + */ +- if (error_code & PF_PK) ++ if (error_code & X86_PF_PK) + return 1; + + /* + * Make sure to check the VMA so that we do not perform +- * faults just to hit a PF_PK as soon as we fill in a ++ * faults just to hit a X86_PF_PK as soon as we fill in a + * page. 
+ */ +- if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE), +- (error_code & PF_INSTR), foreign)) ++ if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE), ++ (error_code & X86_PF_INSTR), foreign)) + return 1; + +- if (error_code & PF_WRITE) { ++ if (error_code & X86_PF_WRITE) { + /* write, present and write, not present: */ + if (unlikely(!(vma->vm_flags & VM_WRITE))) + return 1; +@@ -1221,7 +1201,7 @@ access_error(unsigned long error_code, struct vm_area_struct *vma) + } + + /* read, present: */ +- if (unlikely(error_code & PF_PROT)) ++ if (unlikely(error_code & X86_PF_PROT)) + return 1; + + /* read, not present: */ +@@ -1244,7 +1224,7 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs) + if (!static_cpu_has(X86_FEATURE_SMAP)) + return false; + +- if (error_code & PF_USER) ++ if (error_code & X86_PF_USER) + return false; + + if (!user_mode(regs) && (regs->flags & X86_EFLAGS_AC)) +@@ -1297,7 +1277,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, + * protection error (error_code & 9) == 0. + */ + if (unlikely(fault_in_kernel_space(address))) { +- if (!(error_code & (PF_RSVD | PF_USER | PF_PROT))) { ++ if (!(error_code & (X86_PF_RSVD | X86_PF_USER | X86_PF_PROT))) { + if (vmalloc_fault(address) >= 0) + return; + +@@ -1325,7 +1305,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, + if (unlikely(kprobes_fault(regs))) + return; + +- if (unlikely(error_code & PF_RSVD)) ++ if (unlikely(error_code & X86_PF_RSVD)) + pgtable_bad(regs, error_code, address); + + if (unlikely(smap_violation(error_code, regs))) { +@@ -1351,7 +1331,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, + */ + if (user_mode(regs)) { + local_irq_enable(); +- error_code |= PF_USER; ++ error_code |= X86_PF_USER; + flags |= FAULT_FLAG_USER; + } else { + if (regs->flags & X86_EFLAGS_IF) +@@ -1360,9 +1340,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, + + perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); + +- if (error_code & PF_WRITE) ++ if (error_code & X86_PF_WRITE) + flags |= FAULT_FLAG_WRITE; +- if (error_code & PF_INSTR) ++ if (error_code & X86_PF_INSTR) + flags |= FAULT_FLAG_INSTRUCTION; + + /* +@@ -1382,7 +1362,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, + * space check, thus avoiding the deadlock: + */ + if (unlikely(!down_read_trylock(&mm->mmap_sem))) { +- if ((error_code & PF_USER) == 0 && ++ if (!(error_code & X86_PF_USER) && + !search_exception_tables(regs->ip)) { + bad_area_nosemaphore(regs, error_code, address, NULL); + return; +@@ -1409,7 +1389,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, + bad_area(regs, error_code, address); + return; + } +- if (error_code & PF_USER) { ++ if (error_code & X86_PF_USER) { + /* + * Accessing the stack below %sp is always a bug. 
+ * The large cushion allows instructions like enter +diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c +index af5c1ed21d43..a22c2b95e513 100644 +--- a/arch/x86/mm/init.c ++++ b/arch/x86/mm/init.c +@@ -671,7 +671,7 @@ void __init init_mem_mapping(void) + load_cr3(swapper_pg_dir); + __flush_tlb_all(); + +- hypervisor_init_mem_mapping(); ++ x86_init.hyper.init_mem_mapping(); + + early_memtest(0, max_pfn_mapped << PAGE_SHIFT); + } +diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c +index 048fbe8fc274..adcea90a2046 100644 +--- a/arch/x86/mm/init_64.c ++++ b/arch/x86/mm/init_64.c +@@ -1426,16 +1426,16 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node) + + #if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HAVE_BOOTMEM_INFO_NODE) + void register_page_bootmem_memmap(unsigned long section_nr, +- struct page *start_page, unsigned long size) ++ struct page *start_page, unsigned long nr_pages) + { + unsigned long addr = (unsigned long)start_page; +- unsigned long end = (unsigned long)(start_page + size); ++ unsigned long end = (unsigned long)(start_page + nr_pages); + unsigned long next; + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; +- unsigned int nr_pages; ++ unsigned int nr_pmd_pages; + struct page *page; + + for (; addr < end; addr = next) { +@@ -1482,9 +1482,9 @@ void register_page_bootmem_memmap(unsigned long section_nr, + if (pmd_none(*pmd)) + continue; + +- nr_pages = 1 << (get_order(PMD_SIZE)); ++ nr_pmd_pages = 1 << get_order(PMD_SIZE); + page = pmd_page(*pmd); +- while (nr_pages--) ++ while (nr_pmd_pages--) + get_page_bootmem(section_nr, page++, + SECTION_INFO); + } +diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c +index 8f5be3eb40dd..9ec70d780f1f 100644 +--- a/arch/x86/mm/kasan_init_64.c ++++ b/arch/x86/mm/kasan_init_64.c +@@ -4,19 +4,150 @@ + #include + #include + #include ++#include + #include + #include + #include + #include + + #include ++#include + #include + #include + #include + + extern struct range pfn_mapped[E820_MAX_ENTRIES]; + +-static int __init map_range(struct range *range) ++static p4d_t tmp_p4d_table[PTRS_PER_P4D] __initdata __aligned(PAGE_SIZE); ++ ++static __init void *early_alloc(size_t size, int nid) ++{ ++ return memblock_virt_alloc_try_nid_nopanic(size, size, ++ __pa(MAX_DMA_ADDRESS), BOOTMEM_ALLOC_ACCESSIBLE, nid); ++} ++ ++static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr, ++ unsigned long end, int nid) ++{ ++ pte_t *pte; ++ ++ if (pmd_none(*pmd)) { ++ void *p; ++ ++ if (boot_cpu_has(X86_FEATURE_PSE) && ++ ((end - addr) == PMD_SIZE) && ++ IS_ALIGNED(addr, PMD_SIZE)) { ++ p = early_alloc(PMD_SIZE, nid); ++ if (p && pmd_set_huge(pmd, __pa(p), PAGE_KERNEL)) ++ return; ++ else if (p) ++ memblock_free(__pa(p), PMD_SIZE); ++ } ++ ++ p = early_alloc(PAGE_SIZE, nid); ++ pmd_populate_kernel(&init_mm, pmd, p); ++ } ++ ++ pte = pte_offset_kernel(pmd, addr); ++ do { ++ pte_t entry; ++ void *p; ++ ++ if (!pte_none(*pte)) ++ continue; ++ ++ p = early_alloc(PAGE_SIZE, nid); ++ entry = pfn_pte(PFN_DOWN(__pa(p)), PAGE_KERNEL); ++ set_pte_at(&init_mm, addr, pte, entry); ++ } while (pte++, addr += PAGE_SIZE, addr != end); ++} ++ ++static void __init kasan_populate_pud(pud_t *pud, unsigned long addr, ++ unsigned long end, int nid) ++{ ++ pmd_t *pmd; ++ unsigned long next; ++ ++ if (pud_none(*pud)) { ++ void *p; ++ ++ if (boot_cpu_has(X86_FEATURE_GBPAGES) && ++ ((end - addr) == PUD_SIZE) && ++ IS_ALIGNED(addr, PUD_SIZE)) { ++ p = early_alloc(PUD_SIZE, nid); ++ if (p && 
pud_set_huge(pud, __pa(p), PAGE_KERNEL)) ++ return; ++ else if (p) ++ memblock_free(__pa(p), PUD_SIZE); ++ } ++ ++ p = early_alloc(PAGE_SIZE, nid); ++ pud_populate(&init_mm, pud, p); ++ } ++ ++ pmd = pmd_offset(pud, addr); ++ do { ++ next = pmd_addr_end(addr, end); ++ if (!pmd_large(*pmd)) ++ kasan_populate_pmd(pmd, addr, next, nid); ++ } while (pmd++, addr = next, addr != end); ++} ++ ++static void __init kasan_populate_p4d(p4d_t *p4d, unsigned long addr, ++ unsigned long end, int nid) ++{ ++ pud_t *pud; ++ unsigned long next; ++ ++ if (p4d_none(*p4d)) { ++ void *p = early_alloc(PAGE_SIZE, nid); ++ ++ p4d_populate(&init_mm, p4d, p); ++ } ++ ++ pud = pud_offset(p4d, addr); ++ do { ++ next = pud_addr_end(addr, end); ++ if (!pud_large(*pud)) ++ kasan_populate_pud(pud, addr, next, nid); ++ } while (pud++, addr = next, addr != end); ++} ++ ++static void __init kasan_populate_pgd(pgd_t *pgd, unsigned long addr, ++ unsigned long end, int nid) ++{ ++ void *p; ++ p4d_t *p4d; ++ unsigned long next; ++ ++ if (pgd_none(*pgd)) { ++ p = early_alloc(PAGE_SIZE, nid); ++ pgd_populate(&init_mm, pgd, p); ++ } ++ ++ p4d = p4d_offset(pgd, addr); ++ do { ++ next = p4d_addr_end(addr, end); ++ kasan_populate_p4d(p4d, addr, next, nid); ++ } while (p4d++, addr = next, addr != end); ++} ++ ++static void __init kasan_populate_shadow(unsigned long addr, unsigned long end, ++ int nid) ++{ ++ pgd_t *pgd; ++ unsigned long next; ++ ++ addr = addr & PAGE_MASK; ++ end = round_up(end, PAGE_SIZE); ++ pgd = pgd_offset_k(addr); ++ do { ++ next = pgd_addr_end(addr, end); ++ kasan_populate_pgd(pgd, addr, next, nid); ++ } while (pgd++, addr = next, addr != end); ++} ++ ++static void __init map_range(struct range *range) + { + unsigned long start; + unsigned long end; +@@ -24,15 +155,17 @@ static int __init map_range(struct range *range) + start = (unsigned long)kasan_mem_to_shadow(pfn_to_kaddr(range->start)); + end = (unsigned long)kasan_mem_to_shadow(pfn_to_kaddr(range->end)); + +- return vmemmap_populate(start, end, NUMA_NO_NODE); ++ kasan_populate_shadow(start, end, early_pfn_to_nid(range->start)); + } + + static void __init clear_pgds(unsigned long start, + unsigned long end) + { + pgd_t *pgd; ++ /* See comment in kasan_init() */ ++ unsigned long pgd_end = end & PGDIR_MASK; + +- for (; start < end; start += PGDIR_SIZE) { ++ for (; start < pgd_end; start += PGDIR_SIZE) { + pgd = pgd_offset_k(start); + /* + * With folded p4d, pgd_clear() is nop, use p4d_clear() +@@ -43,29 +176,61 @@ static void __init clear_pgds(unsigned long start, + else + pgd_clear(pgd); + } ++ ++ pgd = pgd_offset_k(start); ++ for (; start < end; start += P4D_SIZE) ++ p4d_clear(p4d_offset(pgd, start)); ++} ++ ++static inline p4d_t *early_p4d_offset(pgd_t *pgd, unsigned long addr) ++{ ++ unsigned long p4d; ++ ++ if (!IS_ENABLED(CONFIG_X86_5LEVEL)) ++ return (p4d_t *)pgd; ++ ++ p4d = __pa_nodebug(pgd_val(*pgd)) & PTE_PFN_MASK; ++ p4d += __START_KERNEL_map - phys_base; ++ return (p4d_t *)p4d + p4d_index(addr); ++} ++ ++static void __init kasan_early_p4d_populate(pgd_t *pgd, ++ unsigned long addr, ++ unsigned long end) ++{ ++ pgd_t pgd_entry; ++ p4d_t *p4d, p4d_entry; ++ unsigned long next; ++ ++ if (pgd_none(*pgd)) { ++ pgd_entry = __pgd(_KERNPG_TABLE | __pa_nodebug(kasan_zero_p4d)); ++ set_pgd(pgd, pgd_entry); ++ } ++ ++ p4d = early_p4d_offset(pgd, addr); ++ do { ++ next = p4d_addr_end(addr, end); ++ ++ if (!p4d_none(*p4d)) ++ continue; ++ ++ p4d_entry = __p4d(_KERNPG_TABLE | __pa_nodebug(kasan_zero_pud)); ++ set_p4d(p4d, p4d_entry); ++ } while (p4d++, addr = 
next, addr != end && p4d_none(*p4d)); + } + + static void __init kasan_map_early_shadow(pgd_t *pgd) + { +- int i; +- unsigned long start = KASAN_SHADOW_START; ++ /* See comment in kasan_init() */ ++ unsigned long addr = KASAN_SHADOW_START & PGDIR_MASK; + unsigned long end = KASAN_SHADOW_END; ++ unsigned long next; + +- for (i = pgd_index(start); start < end; i++) { +- switch (CONFIG_PGTABLE_LEVELS) { +- case 4: +- pgd[i] = __pgd(__pa_nodebug(kasan_zero_pud) | +- _KERNPG_TABLE); +- break; +- case 5: +- pgd[i] = __pgd(__pa_nodebug(kasan_zero_p4d) | +- _KERNPG_TABLE); +- break; +- default: +- BUILD_BUG(); +- } +- start += PGDIR_SIZE; +- } ++ pgd += pgd_index(addr); ++ do { ++ next = pgd_addr_end(addr, end); ++ kasan_early_p4d_populate(pgd, addr, next); ++ } while (pgd++, addr = next, addr != end); + } + + #ifdef CONFIG_KASAN_INLINE +@@ -102,7 +267,7 @@ void __init kasan_early_init(void) + for (i = 0; i < PTRS_PER_PUD; i++) + kasan_zero_pud[i] = __pud(pud_val); + +- for (i = 0; CONFIG_PGTABLE_LEVELS >= 5 && i < PTRS_PER_P4D; i++) ++ for (i = 0; IS_ENABLED(CONFIG_X86_5LEVEL) && i < PTRS_PER_P4D; i++) + kasan_zero_p4d[i] = __p4d(p4d_val); + + kasan_map_early_shadow(early_top_pgt); +@@ -112,37 +277,76 @@ void __init kasan_early_init(void) + void __init kasan_init(void) + { + int i; ++ void *shadow_cpu_entry_begin, *shadow_cpu_entry_end; + + #ifdef CONFIG_KASAN_INLINE + register_die_notifier(&kasan_die_notifier); + #endif + + memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt)); ++ ++ /* ++ * We use the same shadow offset for 4- and 5-level paging to ++ * facilitate boot-time switching between paging modes. ++ * As result in 5-level paging mode KASAN_SHADOW_START and ++ * KASAN_SHADOW_END are not aligned to PGD boundary. ++ * ++ * KASAN_SHADOW_START doesn't share PGD with anything else. ++ * We claim whole PGD entry to make things easier. ++ * ++ * KASAN_SHADOW_END lands in the last PGD entry and it collides with ++ * bunch of things like kernel code, modules, EFI mapping, etc. ++ * We need to take extra steps to not overwrite them. 
++ */ ++ if (IS_ENABLED(CONFIG_X86_5LEVEL)) { ++ void *ptr; ++ ++ ptr = (void *)pgd_page_vaddr(*pgd_offset_k(KASAN_SHADOW_END)); ++ memcpy(tmp_p4d_table, (void *)ptr, sizeof(tmp_p4d_table)); ++ set_pgd(&early_top_pgt[pgd_index(KASAN_SHADOW_END)], ++ __pgd(__pa(tmp_p4d_table) | _KERNPG_TABLE)); ++ } ++ + load_cr3(early_top_pgt); + __flush_tlb_all(); + +- clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END); ++ clear_pgds(KASAN_SHADOW_START & PGDIR_MASK, KASAN_SHADOW_END); + +- kasan_populate_zero_shadow((void *)KASAN_SHADOW_START, ++ kasan_populate_zero_shadow((void *)(KASAN_SHADOW_START & PGDIR_MASK), + kasan_mem_to_shadow((void *)PAGE_OFFSET)); + + for (i = 0; i < E820_MAX_ENTRIES; i++) { + if (pfn_mapped[i].end == 0) + break; + +- if (map_range(&pfn_mapped[i])) +- panic("kasan: unable to allocate shadow!"); ++ map_range(&pfn_mapped[i]); + } ++ + kasan_populate_zero_shadow( + kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM), + kasan_mem_to_shadow((void *)__START_KERNEL_map)); + +- vmemmap_populate((unsigned long)kasan_mem_to_shadow(_stext), +- (unsigned long)kasan_mem_to_shadow(_end), +- NUMA_NO_NODE); ++ kasan_populate_shadow((unsigned long)kasan_mem_to_shadow(_stext), ++ (unsigned long)kasan_mem_to_shadow(_end), ++ early_pfn_to_nid(__pa(_stext))); ++ ++ shadow_cpu_entry_begin = (void *)__fix_to_virt(FIX_CPU_ENTRY_AREA_BOTTOM); ++ shadow_cpu_entry_begin = kasan_mem_to_shadow(shadow_cpu_entry_begin); ++ shadow_cpu_entry_begin = (void *)round_down((unsigned long)shadow_cpu_entry_begin, ++ PAGE_SIZE); ++ ++ shadow_cpu_entry_end = (void *)(__fix_to_virt(FIX_CPU_ENTRY_AREA_TOP) + PAGE_SIZE); ++ shadow_cpu_entry_end = kasan_mem_to_shadow(shadow_cpu_entry_end); ++ shadow_cpu_entry_end = (void *)round_up((unsigned long)shadow_cpu_entry_end, ++ PAGE_SIZE); + + kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END), +- (void *)KASAN_SHADOW_END); ++ shadow_cpu_entry_begin); ++ ++ kasan_populate_shadow((unsigned long)shadow_cpu_entry_begin, ++ (unsigned long)shadow_cpu_entry_end, 0); ++ ++ kasan_populate_zero_shadow(shadow_cpu_entry_end, (void *)KASAN_SHADOW_END); + + load_cr3(init_top_pgt); + __flush_tlb_all(); +diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c +index 84fcfde53f8f..04d5157fe7f8 100644 +--- a/arch/x86/power/cpu.c ++++ b/arch/x86/power/cpu.c +@@ -160,17 +160,19 @@ static void do_fpu_end(void) + static void fix_processor_context(void) + { + int cpu = smp_processor_id(); +- struct tss_struct *t = &per_cpu(cpu_tss, cpu); + #ifdef CONFIG_X86_64 + struct desc_struct *desc = get_cpu_gdt_rw(cpu); + tss_desc tss; + #endif +- set_tss_desc(cpu, t); /* +- * This just modifies memory; should not be +- * necessary. But... This is necessary, because +- * 386 hardware has concept of busy TSS or some +- * similar stupidity. +- */ ++ ++ /* ++ * We need to reload TR, which requires that we change the ++ * GDT entry to indicate "available" first. ++ * ++ * XXX: This could probably all be replaced by a call to ++ * force_reload_TR(). 
++ */ ++ set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); + + #ifdef CONFIG_X86_64 + memcpy(&tss, &desc[GDT_ENTRY_TSS], sizeof(tss_desc)); +diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c +index de503c225ae1..754d5391d9fa 100644 +--- a/arch/x86/xen/enlighten_hvm.c ++++ b/arch/x86/xen/enlighten_hvm.c +@@ -226,12 +226,12 @@ static uint32_t __init xen_platform_hvm(void) + return xen_cpuid_base(); + } + +-const struct hypervisor_x86 x86_hyper_xen_hvm = { ++const __initconst struct hypervisor_x86 x86_hyper_xen_hvm = { + .name = "Xen HVM", + .detect = xen_platform_hvm, +- .init_platform = xen_hvm_guest_init, +- .pin_vcpu = xen_pin_vcpu, +- .x2apic_available = xen_x2apic_para_available, +- .init_mem_mapping = xen_hvm_init_mem_mapping, ++ .type = X86_HYPER_XEN_HVM, ++ .init.init_platform = xen_hvm_guest_init, ++ .init.x2apic_available = xen_x2apic_para_available, ++ .init.init_mem_mapping = xen_hvm_init_mem_mapping, ++ .runtime.pin_vcpu = xen_pin_vcpu, + }; +-EXPORT_SYMBOL(x86_hyper_xen_hvm); +diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c +index d4396e27b1fb..ae3a071e1d0f 100644 +--- a/arch/x86/xen/enlighten_pv.c ++++ b/arch/x86/xen/enlighten_pv.c +@@ -601,7 +601,7 @@ static struct trap_array_entry trap_array[] = { + #ifdef CONFIG_X86_MCE + { machine_check, xen_machine_check, true }, + #endif +- { nmi, xen_nmi, true }, ++ { nmi, xen_xennmi, true }, + { overflow, xen_overflow, false }, + #ifdef CONFIG_IA32_EMULATION + { entry_INT80_compat, xen_entry_INT80_compat, false }, +@@ -811,15 +811,14 @@ static void __init xen_write_gdt_entry_boot(struct desc_struct *dt, int entry, + } + } + +-static void xen_load_sp0(struct tss_struct *tss, +- struct thread_struct *thread) ++static void xen_load_sp0(unsigned long sp0) + { + struct multicall_space mcs; + + mcs = xen_mc_entry(0); +- MULTI_stack_switch(mcs.mc, __KERNEL_DS, thread->sp0); ++ MULTI_stack_switch(mcs.mc, __KERNEL_DS, sp0); + xen_mc_issue(PARAVIRT_LAZY_CPU); +- tss->x86_tss.sp0 = thread->sp0; ++ this_cpu_write(cpu_tss_rw.x86_tss.sp0, sp0); + } + + void xen_set_iopl_mask(unsigned mask) +@@ -1460,9 +1459,9 @@ static uint32_t __init xen_platform_pv(void) + return 0; + } + +-const struct hypervisor_x86 x86_hyper_xen_pv = { ++const __initconst struct hypervisor_x86 x86_hyper_xen_pv = { + .name = "Xen PV", + .detect = xen_platform_pv, +- .pin_vcpu = xen_pin_vcpu, ++ .type = X86_HYPER_XEN_PV, ++ .runtime.pin_vcpu = xen_pin_vcpu, + }; +-EXPORT_SYMBOL(x86_hyper_xen_pv); +diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c +index 71495f1a86d7..c2454237fa67 100644 +--- a/arch/x86/xen/mmu_pv.c ++++ b/arch/x86/xen/mmu_pv.c +@@ -449,7 +449,7 @@ __visible pmd_t xen_make_pmd(pmdval_t pmd) + } + PV_CALLEE_SAVE_REGS_THUNK(xen_make_pmd); + +-#if CONFIG_PGTABLE_LEVELS == 4 ++#ifdef CONFIG_X86_64 + __visible pudval_t xen_pud_val(pud_t pud) + { + return pte_mfn_to_pfn(pud.pud); +@@ -538,7 +538,7 @@ static void xen_set_p4d(p4d_t *ptr, p4d_t val) + + xen_mc_issue(PARAVIRT_LAZY_MMU); + } +-#endif /* CONFIG_PGTABLE_LEVELS == 4 */ ++#endif /* CONFIG_X86_64 */ + + static int xen_pmd_walk(struct mm_struct *mm, pmd_t *pmd, + int (*func)(struct mm_struct *mm, struct page *, enum pt_level), +@@ -580,21 +580,17 @@ static int xen_p4d_walk(struct mm_struct *mm, p4d_t *p4d, + int (*func)(struct mm_struct *mm, struct page *, enum pt_level), + bool last, unsigned long limit) + { +- int i, nr, flush = 0; ++ int flush = 0; ++ pud_t *pud; + +- nr = last ? 
p4d_index(limit) + 1 : PTRS_PER_P4D; +- for (i = 0; i < nr; i++) { +- pud_t *pud; + +- if (p4d_none(p4d[i])) +- continue; ++ if (p4d_none(*p4d)) ++ return flush; + +- pud = pud_offset(&p4d[i], 0); +- if (PTRS_PER_PUD > 1) +- flush |= (*func)(mm, virt_to_page(pud), PT_PUD); +- flush |= xen_pud_walk(mm, pud, func, +- last && i == nr - 1, limit); +- } ++ pud = pud_offset(p4d, 0); ++ if (PTRS_PER_PUD > 1) ++ flush |= (*func)(mm, virt_to_page(pud), PT_PUD); ++ flush |= xen_pud_walk(mm, pud, func, last, limit); + return flush; + } + +@@ -644,8 +640,6 @@ static int __xen_pgd_walk(struct mm_struct *mm, pgd_t *pgd, + continue; + + p4d = p4d_offset(&pgd[i], 0); +- if (PTRS_PER_P4D > 1) +- flush |= (*func)(mm, virt_to_page(p4d), PT_P4D); + flush |= xen_p4d_walk(mm, p4d, func, i == nr - 1, limit); + } + +@@ -1176,22 +1170,14 @@ static void __init xen_cleanmfnmap(unsigned long vaddr) + { + pgd_t *pgd; + p4d_t *p4d; +- unsigned int i; + bool unpin; + + unpin = (vaddr == 2 * PGDIR_SIZE); + vaddr &= PMD_MASK; + pgd = pgd_offset_k(vaddr); + p4d = p4d_offset(pgd, 0); +- for (i = 0; i < PTRS_PER_P4D; i++) { +- if (p4d_none(p4d[i])) +- continue; +- xen_cleanmfnmap_p4d(p4d + i, unpin); +- } +- if (IS_ENABLED(CONFIG_X86_5LEVEL)) { +- set_pgd(pgd, __pgd(0)); +- xen_cleanmfnmap_free_pgtbl(p4d, unpin); +- } ++ if (!p4d_none(*p4d)) ++ xen_cleanmfnmap_p4d(p4d, unpin); + } + + static void __init xen_pagetable_p2m_free(void) +@@ -1692,7 +1678,7 @@ static void xen_release_pmd(unsigned long pfn) + xen_release_ptpage(pfn, PT_PMD); + } + +-#if CONFIG_PGTABLE_LEVELS >= 4 ++#ifdef CONFIG_X86_64 + static void xen_alloc_pud(struct mm_struct *mm, unsigned long pfn) + { + xen_alloc_ptpage(mm, pfn, PT_PUD); +@@ -2029,13 +2015,12 @@ static phys_addr_t __init xen_early_virt_to_phys(unsigned long vaddr) + */ + void __init xen_relocate_p2m(void) + { +- phys_addr_t size, new_area, pt_phys, pmd_phys, pud_phys, p4d_phys; ++ phys_addr_t size, new_area, pt_phys, pmd_phys, pud_phys; + unsigned long p2m_pfn, p2m_pfn_end, n_frames, pfn, pfn_end; +- int n_pte, n_pt, n_pmd, n_pud, n_p4d, idx_pte, idx_pt, idx_pmd, idx_pud, idx_p4d; ++ int n_pte, n_pt, n_pmd, n_pud, idx_pte, idx_pt, idx_pmd, idx_pud; + pte_t *pt; + pmd_t *pmd; + pud_t *pud; +- p4d_t *p4d = NULL; + pgd_t *pgd; + unsigned long *new_p2m; + int save_pud; +@@ -2045,11 +2030,7 @@ void __init xen_relocate_p2m(void) + n_pt = roundup(size, PMD_SIZE) >> PMD_SHIFT; + n_pmd = roundup(size, PUD_SIZE) >> PUD_SHIFT; + n_pud = roundup(size, P4D_SIZE) >> P4D_SHIFT; +- if (PTRS_PER_P4D > 1) +- n_p4d = roundup(size, PGDIR_SIZE) >> PGDIR_SHIFT; +- else +- n_p4d = 0; +- n_frames = n_pte + n_pt + n_pmd + n_pud + n_p4d; ++ n_frames = n_pte + n_pt + n_pmd + n_pud; + + new_area = xen_find_free_area(PFN_PHYS(n_frames)); + if (!new_area) { +@@ -2065,76 +2046,56 @@ void __init xen_relocate_p2m(void) + * To avoid any possible virtual address collision, just use + * 2 * PUD_SIZE for the new area. 
+ */ +- p4d_phys = new_area; +- pud_phys = p4d_phys + PFN_PHYS(n_p4d); ++ pud_phys = new_area; + pmd_phys = pud_phys + PFN_PHYS(n_pud); + pt_phys = pmd_phys + PFN_PHYS(n_pmd); + p2m_pfn = PFN_DOWN(pt_phys) + n_pt; + + pgd = __va(read_cr3_pa()); + new_p2m = (unsigned long *)(2 * PGDIR_SIZE); +- idx_p4d = 0; + save_pud = n_pud; +- do { +- if (n_p4d > 0) { +- p4d = early_memremap(p4d_phys, PAGE_SIZE); +- clear_page(p4d); +- n_pud = min(save_pud, PTRS_PER_P4D); +- } +- for (idx_pud = 0; idx_pud < n_pud; idx_pud++) { +- pud = early_memremap(pud_phys, PAGE_SIZE); +- clear_page(pud); +- for (idx_pmd = 0; idx_pmd < min(n_pmd, PTRS_PER_PUD); +- idx_pmd++) { +- pmd = early_memremap(pmd_phys, PAGE_SIZE); +- clear_page(pmd); +- for (idx_pt = 0; idx_pt < min(n_pt, PTRS_PER_PMD); +- idx_pt++) { +- pt = early_memremap(pt_phys, PAGE_SIZE); +- clear_page(pt); +- for (idx_pte = 0; +- idx_pte < min(n_pte, PTRS_PER_PTE); +- idx_pte++) { +- set_pte(pt + idx_pte, +- pfn_pte(p2m_pfn, PAGE_KERNEL)); +- p2m_pfn++; +- } +- n_pte -= PTRS_PER_PTE; +- early_memunmap(pt, PAGE_SIZE); +- make_lowmem_page_readonly(__va(pt_phys)); +- pin_pagetable_pfn(MMUEXT_PIN_L1_TABLE, +- PFN_DOWN(pt_phys)); +- set_pmd(pmd + idx_pt, +- __pmd(_PAGE_TABLE | pt_phys)); +- pt_phys += PAGE_SIZE; ++ for (idx_pud = 0; idx_pud < n_pud; idx_pud++) { ++ pud = early_memremap(pud_phys, PAGE_SIZE); ++ clear_page(pud); ++ for (idx_pmd = 0; idx_pmd < min(n_pmd, PTRS_PER_PUD); ++ idx_pmd++) { ++ pmd = early_memremap(pmd_phys, PAGE_SIZE); ++ clear_page(pmd); ++ for (idx_pt = 0; idx_pt < min(n_pt, PTRS_PER_PMD); ++ idx_pt++) { ++ pt = early_memremap(pt_phys, PAGE_SIZE); ++ clear_page(pt); ++ for (idx_pte = 0; ++ idx_pte < min(n_pte, PTRS_PER_PTE); ++ idx_pte++) { ++ set_pte(pt + idx_pte, ++ pfn_pte(p2m_pfn, PAGE_KERNEL)); ++ p2m_pfn++; + } +- n_pt -= PTRS_PER_PMD; +- early_memunmap(pmd, PAGE_SIZE); +- make_lowmem_page_readonly(__va(pmd_phys)); +- pin_pagetable_pfn(MMUEXT_PIN_L2_TABLE, +- PFN_DOWN(pmd_phys)); +- set_pud(pud + idx_pmd, __pud(_PAGE_TABLE | pmd_phys)); +- pmd_phys += PAGE_SIZE; ++ n_pte -= PTRS_PER_PTE; ++ early_memunmap(pt, PAGE_SIZE); ++ make_lowmem_page_readonly(__va(pt_phys)); ++ pin_pagetable_pfn(MMUEXT_PIN_L1_TABLE, ++ PFN_DOWN(pt_phys)); ++ set_pmd(pmd + idx_pt, ++ __pmd(_PAGE_TABLE | pt_phys)); ++ pt_phys += PAGE_SIZE; + } +- n_pmd -= PTRS_PER_PUD; +- early_memunmap(pud, PAGE_SIZE); +- make_lowmem_page_readonly(__va(pud_phys)); +- pin_pagetable_pfn(MMUEXT_PIN_L3_TABLE, PFN_DOWN(pud_phys)); +- if (n_p4d > 0) +- set_p4d(p4d + idx_pud, __p4d(_PAGE_TABLE | pud_phys)); +- else +- set_pgd(pgd + 2 + idx_pud, __pgd(_PAGE_TABLE | pud_phys)); +- pud_phys += PAGE_SIZE; +- } +- if (n_p4d > 0) { +- save_pud -= PTRS_PER_P4D; +- early_memunmap(p4d, PAGE_SIZE); +- make_lowmem_page_readonly(__va(p4d_phys)); +- pin_pagetable_pfn(MMUEXT_PIN_L4_TABLE, PFN_DOWN(p4d_phys)); +- set_pgd(pgd + 2 + idx_p4d, __pgd(_PAGE_TABLE | p4d_phys)); +- p4d_phys += PAGE_SIZE; ++ n_pt -= PTRS_PER_PMD; ++ early_memunmap(pmd, PAGE_SIZE); ++ make_lowmem_page_readonly(__va(pmd_phys)); ++ pin_pagetable_pfn(MMUEXT_PIN_L2_TABLE, ++ PFN_DOWN(pmd_phys)); ++ set_pud(pud + idx_pmd, __pud(_PAGE_TABLE | pmd_phys)); ++ pmd_phys += PAGE_SIZE; + } +- } while (++idx_p4d < n_p4d); ++ n_pmd -= PTRS_PER_PUD; ++ early_memunmap(pud, PAGE_SIZE); ++ make_lowmem_page_readonly(__va(pud_phys)); ++ pin_pagetable_pfn(MMUEXT_PIN_L3_TABLE, PFN_DOWN(pud_phys)); ++ set_pgd(pgd + 2 + idx_pud, __pgd(_PAGE_TABLE | pud_phys)); ++ pud_phys += PAGE_SIZE; ++ } + + /* Now copy the old p2m info to the new area. 
*/ + memcpy(new_p2m, xen_p2m_addr, size); +@@ -2311,7 +2272,7 @@ static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot) + #endif + case FIX_TEXT_POKE0: + case FIX_TEXT_POKE1: +- case FIX_GDT_REMAP_BEGIN ... FIX_GDT_REMAP_END: ++ case FIX_CPU_ENTRY_AREA_TOP ... FIX_CPU_ENTRY_AREA_BOTTOM: + /* All local page mappings */ + pte = pfn_pte(phys, prot); + break; +@@ -2361,7 +2322,7 @@ static void __init xen_post_allocator_init(void) + pv_mmu_ops.set_pte = xen_set_pte; + pv_mmu_ops.set_pmd = xen_set_pmd; + pv_mmu_ops.set_pud = xen_set_pud; +-#if CONFIG_PGTABLE_LEVELS >= 4 ++#ifdef CONFIG_X86_64 + pv_mmu_ops.set_p4d = xen_set_p4d; + #endif + +@@ -2371,7 +2332,7 @@ static void __init xen_post_allocator_init(void) + pv_mmu_ops.alloc_pmd = xen_alloc_pmd; + pv_mmu_ops.release_pte = xen_release_pte; + pv_mmu_ops.release_pmd = xen_release_pmd; +-#if CONFIG_PGTABLE_LEVELS >= 4 ++#ifdef CONFIG_X86_64 + pv_mmu_ops.alloc_pud = xen_alloc_pud; + pv_mmu_ops.release_pud = xen_release_pud; + #endif +@@ -2435,14 +2396,14 @@ static const struct pv_mmu_ops xen_mmu_ops __initconst = { + .make_pmd = PV_CALLEE_SAVE(xen_make_pmd), + .pmd_val = PV_CALLEE_SAVE(xen_pmd_val), + +-#if CONFIG_PGTABLE_LEVELS >= 4 ++#ifdef CONFIG_X86_64 + .pud_val = PV_CALLEE_SAVE(xen_pud_val), + .make_pud = PV_CALLEE_SAVE(xen_make_pud), + .set_p4d = xen_set_p4d_hyper, + + .alloc_pud = xen_alloc_pmd_init, + .release_pud = xen_release_pmd_init, +-#endif /* CONFIG_PGTABLE_LEVELS == 4 */ ++#endif /* CONFIG_X86_64 */ + + .activate_mm = xen_activate_mm, + .dup_mmap = xen_dup_mmap, +diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c +index 05f91ce9b55e..c0c756c76afe 100644 +--- a/arch/x86/xen/smp_pv.c ++++ b/arch/x86/xen/smp_pv.c +@@ -14,6 +14,7 @@ + * single-threaded. + */ + #include ++#include + #include + #include + #include +@@ -294,12 +295,19 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle) + #endif + memset(&ctxt->fpu_ctxt, 0, sizeof(ctxt->fpu_ctxt)); + ++ /* ++ * Bring up the CPU in cpu_bringup_and_idle() with the stack ++ * pointing just below where pt_regs would be if it were a normal ++ * kernel entry. ++ */ + ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle; + ctxt->flags = VGCF_IN_KERNEL; + ctxt->user_regs.eflags = 0x1000; /* IOPL_RING1 */ + ctxt->user_regs.ds = __USER_DS; + ctxt->user_regs.es = __USER_DS; + ctxt->user_regs.ss = __KERNEL_DS; ++ ctxt->user_regs.cs = __KERNEL_CS; ++ ctxt->user_regs.esp = (unsigned long)task_pt_regs(idle); + + xen_copy_trap_info(ctxt->trap_ctxt); + +@@ -314,8 +322,13 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle) + ctxt->gdt_frames[0] = gdt_mfn; + ctxt->gdt_ents = GDT_ENTRIES; + ++ /* ++ * Set SS:SP that Xen will use when entering guest kernel mode ++ * from guest user mode. Subsequent calls to load_sp0() can ++ * change this value. 
++ */ + ctxt->kernel_ss = __KERNEL_DS; +- ctxt->kernel_sp = idle->thread.sp0; ++ ctxt->kernel_sp = task_top_of_stack(idle); + + #ifdef CONFIG_X86_32 + ctxt->event_callback_cs = __KERNEL_CS; +@@ -327,10 +340,8 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle) + (unsigned long)xen_hypervisor_callback; + ctxt->failsafe_callback_eip = + (unsigned long)xen_failsafe_callback; +- ctxt->user_regs.cs = __KERNEL_CS; + per_cpu(xen_cr3, cpu) = __pa(swapper_pg_dir); + +- ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs); + ctxt->ctrlreg[3] = xen_pfn_to_cr3(virt_to_gfn(swapper_pg_dir)); + if (HYPERVISOR_vcpu_op(VCPUOP_initialise, xen_vcpu_nr(cpu), ctxt)) + BUG(); +diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S +index c98a48c861fd..8a10c9a9e2b5 100644 +--- a/arch/x86/xen/xen-asm_64.S ++++ b/arch/x86/xen/xen-asm_64.S +@@ -30,7 +30,7 @@ xen_pv_trap debug + xen_pv_trap xendebug + xen_pv_trap int3 + xen_pv_trap xenint3 +-xen_pv_trap nmi ++xen_pv_trap xennmi + xen_pv_trap overflow + xen_pv_trap bounds + xen_pv_trap invalid_op +diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S +index b5b8d7f43557..497cc55a0c16 100644 +--- a/arch/x86/xen/xen-head.S ++++ b/arch/x86/xen/xen-head.S +@@ -10,6 +10,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -20,6 +21,7 @@ + #ifdef CONFIG_XEN_PV + __INIT + ENTRY(startup_xen) ++ UNWIND_HINT_EMPTY + cld + + /* Clear .bss */ +@@ -34,21 +36,24 @@ ENTRY(startup_xen) + mov $init_thread_union+THREAD_SIZE, %_ASM_SP + + jmp xen_start_kernel +- ++END(startup_xen) + __FINIT + #endif + + .pushsection .text + .balign PAGE_SIZE + ENTRY(hypercall_page) +- .skip PAGE_SIZE ++ .rept (PAGE_SIZE / 32) ++ UNWIND_HINT_EMPTY ++ .skip 32 ++ .endr + + #define HYPERCALL(n) \ + .equ xen_hypercall_##n, hypercall_page + __HYPERVISOR_##n * 32; \ + .type xen_hypercall_##n, @function; .size xen_hypercall_##n, 32 + #include + #undef HYPERCALL +- ++END(hypercall_page) + .popsection + + ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS, .asciz "linux") +diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c +index a4783da90ba8..0f860cf0d56d 100644 +--- a/block/bfq-iosched.c ++++ b/block/bfq-iosched.c +@@ -108,6 +108,7 @@ + #include "blk-mq-tag.h" + #include "blk-mq-sched.h" + #include "bfq-iosched.h" ++#include "blk-wbt.h" + + #define BFQ_BFQQ_FNS(name) \ + void bfq_mark_bfqq_##name(struct bfq_queue *bfqq) \ +@@ -4775,7 +4776,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e) + bfq_init_root_group(bfqd->root_group, bfqd); + bfq_init_entity(&bfqd->oom_bfqq.entity, bfqd->root_group); + +- ++ wbt_disable_default(q); + return 0; + + out_free: +diff --git a/block/blk-wbt.c b/block/blk-wbt.c +index 6a9a0f03a67b..e59d59c11ebb 100644 +--- a/block/blk-wbt.c ++++ b/block/blk-wbt.c +@@ -654,7 +654,7 @@ void wbt_set_write_cache(struct rq_wb *rwb, bool write_cache_on) + } + + /* +- * Disable wbt, if enabled by default. Only called from CFQ. ++ * Disable wbt, if enabled by default. 
+ */ + void wbt_disable_default(struct request_queue *q) + { +diff --git a/crypto/lrw.c b/crypto/lrw.c +index a8bfae4451bf..eb681e9fe574 100644 +--- a/crypto/lrw.c ++++ b/crypto/lrw.c +@@ -610,8 +610,10 @@ static int create(struct crypto_template *tmpl, struct rtattr **tb) + ecb_name[len - 1] = 0; + + if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME, +- "lrw(%s)", ecb_name) >= CRYPTO_MAX_ALG_NAME) +- return -ENAMETOOLONG; ++ "lrw(%s)", ecb_name) >= CRYPTO_MAX_ALG_NAME) { ++ err = -ENAMETOOLONG; ++ goto err_drop_spawn; ++ } + } + + inst->alg.base.cra_flags = alg->base.cra_flags & CRYPTO_ALG_ASYNC; +diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c +index 3c3a37b8503b..572b6c7303ed 100644 +--- a/drivers/acpi/apei/ghes.c ++++ b/drivers/acpi/apei/ghes.c +@@ -51,6 +51,7 @@ + #include + #include + #include ++#include + #include + #include + +@@ -112,7 +113,7 @@ static DEFINE_MUTEX(ghes_list_mutex); + * Because the memory area used to transfer hardware error information + * from BIOS to Linux can be determined only in NMI, IRQ or timer + * handler, but general ioremap can not be used in atomic context, so +- * a special version of atomic ioremap is implemented for that. ++ * the fixmap is used instead. + */ + + /* +@@ -126,8 +127,8 @@ static DEFINE_MUTEX(ghes_list_mutex); + /* virtual memory area for atomic ioremap */ + static struct vm_struct *ghes_ioremap_area; + /* +- * These 2 spinlock is used to prevent atomic ioremap virtual memory +- * area from being mapped simultaneously. ++ * These 2 spinlocks are used to prevent the fixmap entries from being used ++ * simultaneously. + */ + static DEFINE_RAW_SPINLOCK(ghes_ioremap_lock_nmi); + static DEFINE_SPINLOCK(ghes_ioremap_lock_irq); +@@ -159,52 +160,36 @@ static void ghes_ioremap_exit(void) + + static void __iomem *ghes_ioremap_pfn_nmi(u64 pfn) + { +- unsigned long vaddr; + phys_addr_t paddr; + pgprot_t prot; + +- vaddr = (unsigned long)GHES_IOREMAP_NMI_PAGE(ghes_ioremap_area->addr); +- + paddr = pfn << PAGE_SHIFT; + prot = arch_apei_get_mem_attribute(paddr); +- ioremap_page_range(vaddr, vaddr + PAGE_SIZE, paddr, prot); ++ __set_fixmap(FIX_APEI_GHES_NMI, paddr, prot); + +- return (void __iomem *)vaddr; ++ return (void __iomem *) fix_to_virt(FIX_APEI_GHES_NMI); + } + + static void __iomem *ghes_ioremap_pfn_irq(u64 pfn) + { +- unsigned long vaddr, paddr; ++ phys_addr_t paddr; + pgprot_t prot; + +- vaddr = (unsigned long)GHES_IOREMAP_IRQ_PAGE(ghes_ioremap_area->addr); +- + paddr = pfn << PAGE_SHIFT; + prot = arch_apei_get_mem_attribute(paddr); ++ __set_fixmap(FIX_APEI_GHES_IRQ, paddr, prot); + +- ioremap_page_range(vaddr, vaddr + PAGE_SIZE, paddr, prot); +- +- return (void __iomem *)vaddr; ++ return (void __iomem *) fix_to_virt(FIX_APEI_GHES_IRQ); + } + +-static void ghes_iounmap_nmi(void __iomem *vaddr_ptr) ++static void ghes_iounmap_nmi(void) + { +- unsigned long vaddr = (unsigned long __force)vaddr_ptr; +- void *base = ghes_ioremap_area->addr; +- +- BUG_ON(vaddr != (unsigned long)GHES_IOREMAP_NMI_PAGE(base)); +- unmap_kernel_range_noflush(vaddr, PAGE_SIZE); +- arch_apei_flush_tlb_one(vaddr); ++ clear_fixmap(FIX_APEI_GHES_NMI); + } + +-static void ghes_iounmap_irq(void __iomem *vaddr_ptr) ++static void ghes_iounmap_irq(void) + { +- unsigned long vaddr = (unsigned long __force)vaddr_ptr; +- void *base = ghes_ioremap_area->addr; +- +- BUG_ON(vaddr != (unsigned long)GHES_IOREMAP_IRQ_PAGE(base)); +- unmap_kernel_range_noflush(vaddr, PAGE_SIZE); +- arch_apei_flush_tlb_one(vaddr); ++ clear_fixmap(FIX_APEI_GHES_IRQ); + } + + 
static int ghes_estatus_pool_init(void) +@@ -360,10 +345,10 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len, + paddr += trunk; + buffer += trunk; + if (in_nmi) { +- ghes_iounmap_nmi(vaddr); ++ ghes_iounmap_nmi(); + raw_spin_unlock(&ghes_ioremap_lock_nmi); + } else { +- ghes_iounmap_irq(vaddr); ++ ghes_iounmap_irq(); + spin_unlock_irqrestore(&ghes_ioremap_lock_irq, flags); + } + } +@@ -851,17 +836,8 @@ static void ghes_sea_remove(struct ghes *ghes) + synchronize_rcu(); + } + #else /* CONFIG_ACPI_APEI_SEA */ +-static inline void ghes_sea_add(struct ghes *ghes) +-{ +- pr_err(GHES_PFX "ID: %d, trying to add SEA notification which is not supported\n", +- ghes->generic->header.source_id); +-} +- +-static inline void ghes_sea_remove(struct ghes *ghes) +-{ +- pr_err(GHES_PFX "ID: %d, trying to remove SEA notification which is not supported\n", +- ghes->generic->header.source_id); +-} ++static inline void ghes_sea_add(struct ghes *ghes) { } ++static inline void ghes_sea_remove(struct ghes *ghes) { } + #endif /* CONFIG_ACPI_APEI_SEA */ + + #ifdef CONFIG_HAVE_ACPI_APEI_NMI +@@ -1063,23 +1039,9 @@ static void ghes_nmi_init_cxt(void) + init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq); + } + #else /* CONFIG_HAVE_ACPI_APEI_NMI */ +-static inline void ghes_nmi_add(struct ghes *ghes) +-{ +- pr_err(GHES_PFX "ID: %d, trying to add NMI notification which is not supported!\n", +- ghes->generic->header.source_id); +- BUG(); +-} +- +-static inline void ghes_nmi_remove(struct ghes *ghes) +-{ +- pr_err(GHES_PFX "ID: %d, trying to remove NMI notification which is not supported!\n", +- ghes->generic->header.source_id); +- BUG(); +-} +- +-static inline void ghes_nmi_init_cxt(void) +-{ +-} ++static inline void ghes_nmi_add(struct ghes *ghes) { } ++static inline void ghes_nmi_remove(struct ghes *ghes) { } ++static inline void ghes_nmi_init_cxt(void) { } + #endif /* CONFIG_HAVE_ACPI_APEI_NMI */ + + static int ghes_probe(struct platform_device *ghes_dev) +diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c +index a6de32530693..0459b1204694 100644 +--- a/drivers/base/power/opp/core.c ++++ b/drivers/base/power/opp/core.c +@@ -296,7 +296,7 @@ int dev_pm_opp_get_opp_count(struct device *dev) + opp_table = _find_opp_table(dev); + if (IS_ERR(opp_table)) { + count = PTR_ERR(opp_table); +- dev_err(dev, "%s: OPP table not found (%d)\n", ++ dev_dbg(dev, "%s: OPP table not found (%d)\n", + __func__, count); + return count; + } +diff --git a/drivers/bluetooth/hci_bcm.c b/drivers/bluetooth/hci_bcm.c +index e2540113d0da..73d2d88ddc03 100644 +--- a/drivers/bluetooth/hci_bcm.c ++++ b/drivers/bluetooth/hci_bcm.c +@@ -68,7 +68,7 @@ struct bcm_device { + u32 init_speed; + u32 oper_speed; + int irq; +- u8 irq_polarity; ++ bool irq_active_low; + + #ifdef CONFIG_PM + struct hci_uart *hu; +@@ -213,7 +213,9 @@ static int bcm_request_irq(struct bcm_data *bcm) + } + + err = devm_request_irq(&bdev->pdev->dev, bdev->irq, bcm_host_wake, +- IRQF_TRIGGER_RISING, "host_wake", bdev); ++ bdev->irq_active_low ? 
IRQF_TRIGGER_FALLING : ++ IRQF_TRIGGER_RISING, ++ "host_wake", bdev); + if (err) + goto unlock; + +@@ -253,7 +255,7 @@ static int bcm_setup_sleep(struct hci_uart *hu) + struct sk_buff *skb; + struct bcm_set_sleep_mode sleep_params = default_sleep_params; + +- sleep_params.host_wake_active = !bcm->dev->irq_polarity; ++ sleep_params.host_wake_active = !bcm->dev->irq_active_low; + + skb = __hci_cmd_sync(hu->hdev, 0xfc27, sizeof(sleep_params), + &sleep_params, HCI_INIT_TIMEOUT); +@@ -690,10 +692,8 @@ static const struct acpi_gpio_mapping acpi_bcm_int_first_gpios[] = { + }; + + #ifdef CONFIG_ACPI +-static u8 acpi_active_low = ACPI_ACTIVE_LOW; +- + /* IRQ polarity of some chipsets are not defined correctly in ACPI table. */ +-static const struct dmi_system_id bcm_wrong_irq_dmi_table[] = { ++static const struct dmi_system_id bcm_active_low_irq_dmi_table[] = { + { + .ident = "Asus T100TA", + .matches = { +@@ -701,7 +701,6 @@ static const struct dmi_system_id bcm_wrong_irq_dmi_table[] = { + "ASUSTeK COMPUTER INC."), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "T100TA"), + }, +- .driver_data = &acpi_active_low, + }, + { + .ident = "Asus T100CHI", +@@ -710,7 +709,6 @@ static const struct dmi_system_id bcm_wrong_irq_dmi_table[] = { + "ASUSTeK COMPUTER INC."), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "T100CHI"), + }, +- .driver_data = &acpi_active_low, + }, + { /* Handle ThinkPad 8 tablets with BCM2E55 chipset ACPI ID */ + .ident = "Lenovo ThinkPad 8", +@@ -718,7 +716,6 @@ static const struct dmi_system_id bcm_wrong_irq_dmi_table[] = { + DMI_EXACT_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "ThinkPad 8"), + }, +- .driver_data = &acpi_active_low, + }, + { } + }; +@@ -733,13 +730,13 @@ static int bcm_resource(struct acpi_resource *ares, void *data) + switch (ares->type) { + case ACPI_RESOURCE_TYPE_EXTENDED_IRQ: + irq = &ares->data.extended_irq; +- dev->irq_polarity = irq->polarity; ++ dev->irq_active_low = irq->polarity == ACPI_ACTIVE_LOW; + break; + + case ACPI_RESOURCE_TYPE_GPIO: + gpio = &ares->data.gpio; + if (gpio->connection_type == ACPI_RESOURCE_GPIO_TYPE_INT) +- dev->irq_polarity = gpio->polarity; ++ dev->irq_active_low = gpio->polarity == ACPI_ACTIVE_LOW; + break; + + case ACPI_RESOURCE_TYPE_SERIAL_BUS: +@@ -834,11 +831,11 @@ static int bcm_acpi_probe(struct bcm_device *dev) + return ret; + acpi_dev_free_resource_list(&resources); + +- dmi_id = dmi_first_match(bcm_wrong_irq_dmi_table); ++ dmi_id = dmi_first_match(bcm_active_low_irq_dmi_table); + if (dmi_id) { + bt_dev_warn(dev, "%s: Overwriting IRQ polarity to active low", + dmi_id->ident); +- dev->irq_polarity = *(u8 *)dmi_id->driver_data; ++ dev->irq_active_low = true; + } + + return 0; +diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c +index 6e2403805784..6aef3bde10d7 100644 +--- a/drivers/bluetooth/hci_ldisc.c ++++ b/drivers/bluetooth/hci_ldisc.c +@@ -41,6 +41,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -298,6 +299,12 @@ void hci_uart_set_flow_control(struct hci_uart *hu, bool enable) + unsigned int set = 0; + unsigned int clear = 0; + ++ if (hu->serdev) { ++ serdev_device_set_flow_control(hu->serdev, !enable); ++ serdev_device_set_rts(hu->serdev, !enable); ++ return; ++ } ++ + if (enable) { + /* Disable hardware flow control */ + ktermios = tty->termios; +diff --git a/drivers/clk/sunxi-ng/ccu-sun5i.c b/drivers/clk/sunxi-ng/ccu-sun5i.c +index ab9e850b3707..2f385a57cd91 100644 +--- a/drivers/clk/sunxi-ng/ccu-sun5i.c ++++ b/drivers/clk/sunxi-ng/ccu-sun5i.c +@@ -982,8 
+982,8 @@ static void __init sun5i_ccu_init(struct device_node *node, + + /* Force the PLL-Audio-1x divider to 4 */ + val = readl(reg + SUN5I_PLL_AUDIO_REG); +- val &= ~GENMASK(19, 16); +- writel(val | (3 << 16), reg + SUN5I_PLL_AUDIO_REG); ++ val &= ~GENMASK(29, 26); ++ writel(val | (3 << 26), reg + SUN5I_PLL_AUDIO_REG); + + /* + * Use the peripheral PLL as the AHB parent, instead of CPU / +diff --git a/drivers/clk/sunxi-ng/ccu-sun6i-a31.c b/drivers/clk/sunxi-ng/ccu-sun6i-a31.c +index 8af434815fba..241fb13f1c06 100644 +--- a/drivers/clk/sunxi-ng/ccu-sun6i-a31.c ++++ b/drivers/clk/sunxi-ng/ccu-sun6i-a31.c +@@ -608,7 +608,7 @@ static SUNXI_CCU_M_WITH_MUX_GATE(hdmi_clk, "hdmi", lcd_ch1_parents, + 0x150, 0, 4, 24, 2, BIT(31), + CLK_SET_RATE_PARENT); + +-static SUNXI_CCU_GATE(hdmi_ddc_clk, "hdmi-ddc", "osc24M", 0x150, BIT(30), 0); ++static SUNXI_CCU_GATE(hdmi_ddc_clk, "ddc", "osc24M", 0x150, BIT(30), 0); + + static SUNXI_CCU_GATE(ps_clk, "ps", "lcd1-ch1", 0x140, BIT(31), 0); + +diff --git a/drivers/clk/sunxi-ng/ccu_nm.c b/drivers/clk/sunxi-ng/ccu_nm.c +index a32158e8f2e3..84a5e7f17f6f 100644 +--- a/drivers/clk/sunxi-ng/ccu_nm.c ++++ b/drivers/clk/sunxi-ng/ccu_nm.c +@@ -99,6 +99,9 @@ static long ccu_nm_round_rate(struct clk_hw *hw, unsigned long rate, + struct ccu_nm *nm = hw_to_ccu_nm(hw); + struct _ccu_nm _nm; + ++ if (ccu_frac_helper_has_rate(&nm->common, &nm->frac, rate)) ++ return rate; ++ + _nm.min_n = nm->n.min ?: 1; + _nm.max_n = nm->n.max ?: 1 << nm->n.width; + _nm.min_m = 1; +diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c +index 484cc8909d5c..ed4df58a855e 100644 +--- a/drivers/cpuidle/cpuidle.c ++++ b/drivers/cpuidle/cpuidle.c +@@ -208,6 +208,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, + return -EBUSY; + } + target_state = &drv->states[index]; ++ broadcast = false; + } + + /* Take note of the planned idle state. 
*/ +diff --git a/drivers/crypto/amcc/crypto4xx_core.h b/drivers/crypto/amcc/crypto4xx_core.h +index ecfdcfe3698d..4f41d6da5acc 100644 +--- a/drivers/crypto/amcc/crypto4xx_core.h ++++ b/drivers/crypto/amcc/crypto4xx_core.h +@@ -34,12 +34,12 @@ + #define PPC405EX_CE_RESET 0x00000008 + + #define CRYPTO4XX_CRYPTO_PRIORITY 300 +-#define PPC4XX_LAST_PD 63 +-#define PPC4XX_NUM_PD 64 +-#define PPC4XX_LAST_GD 1023 ++#define PPC4XX_NUM_PD 256 ++#define PPC4XX_LAST_PD (PPC4XX_NUM_PD - 1) + #define PPC4XX_NUM_GD 1024 +-#define PPC4XX_LAST_SD 63 +-#define PPC4XX_NUM_SD 64 ++#define PPC4XX_LAST_GD (PPC4XX_NUM_GD - 1) ++#define PPC4XX_NUM_SD 256 ++#define PPC4XX_LAST_SD (PPC4XX_NUM_SD - 1) + #define PPC4XX_SD_BUFFER_SIZE 2048 + + #define PD_ENTRY_INUSE 1 +diff --git a/drivers/gpu/drm/drm_dp_dual_mode_helper.c b/drivers/gpu/drm/drm_dp_dual_mode_helper.c +index 0ef9011a1856..02a50929af67 100644 +--- a/drivers/gpu/drm/drm_dp_dual_mode_helper.c ++++ b/drivers/gpu/drm/drm_dp_dual_mode_helper.c +@@ -410,6 +410,7 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, + { + u8 data; + int ret = 0; ++ int retry; + + if (!mode) { + DRM_ERROR("NULL input\n"); +@@ -417,10 +418,19 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, + } + + /* Read Status: i2c over aux */ +- ret = drm_dp_dual_mode_read(adapter, DP_DUAL_MODE_LSPCON_CURRENT_MODE, +- &data, sizeof(data)); ++ for (retry = 0; retry < 6; retry++) { ++ if (retry) ++ usleep_range(500, 1000); ++ ++ ret = drm_dp_dual_mode_read(adapter, ++ DP_DUAL_MODE_LSPCON_CURRENT_MODE, ++ &data, sizeof(data)); ++ if (!ret) ++ break; ++ } ++ + if (ret < 0) { +- DRM_ERROR("LSPCON read(0x80, 0x41) failed\n"); ++ DRM_DEBUG_KMS("LSPCON read(0x80, 0x41) failed\n"); + return -EFAULT; + } + +diff --git a/drivers/gpu/drm/vc4/vc4_dsi.c b/drivers/gpu/drm/vc4/vc4_dsi.c +index d1e0dc908048..04796d7d0fdb 100644 +--- a/drivers/gpu/drm/vc4/vc4_dsi.c ++++ b/drivers/gpu/drm/vc4/vc4_dsi.c +@@ -866,7 +866,8 @@ static bool vc4_dsi_encoder_mode_fixup(struct drm_encoder *encoder, + adjusted_mode->clock = pixel_clock_hz / 1000 + 1; + + /* Given the new pixel clock, adjust HFP to keep vrefresh the same. 
*/ +- adjusted_mode->htotal = pixel_clock_hz / (mode->vrefresh * mode->vtotal); ++ adjusted_mode->htotal = adjusted_mode->clock * mode->htotal / ++ mode->clock; + adjusted_mode->hsync_end += adjusted_mode->htotal - mode->htotal; + adjusted_mode->hsync_start += adjusted_mode->htotal - mode->htotal; + +diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c +index 937801ac2fe0..2cd134dd94d2 100644 +--- a/drivers/hv/vmbus_drv.c ++++ b/drivers/hv/vmbus_drv.c +@@ -1534,7 +1534,7 @@ static int __init hv_acpi_init(void) + { + int ret, t; + +- if (x86_hyper != &x86_hyper_ms_hyperv) ++ if (x86_hyper_type != X86_HYPER_MS_HYPERV) + return -ENODEV; + + init_completion(&probe_event); +diff --git a/drivers/iio/accel/st_accel_core.c b/drivers/iio/accel/st_accel_core.c +index 752856b3a849..379de1829cdb 100644 +--- a/drivers/iio/accel/st_accel_core.c ++++ b/drivers/iio/accel/st_accel_core.c +@@ -164,7 +164,10 @@ static const struct st_sensor_settings st_accel_sensors_settings[] = { + .mask_int2 = 0x00, + .addr_ihl = 0x25, + .mask_ihl = 0x02, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .sim = { + .addr = 0x23, +@@ -236,7 +239,10 @@ static const struct st_sensor_settings st_accel_sensors_settings[] = { + .mask_ihl = 0x80, + .addr_od = 0x22, + .mask_od = 0x40, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .sim = { + .addr = 0x23, +@@ -318,7 +324,10 @@ static const struct st_sensor_settings st_accel_sensors_settings[] = { + .mask_int2 = 0x00, + .addr_ihl = 0x23, + .mask_ihl = 0x40, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + .ig1 = { + .en_addr = 0x23, + .en_mask = 0x08, +@@ -389,7 +398,10 @@ static const struct st_sensor_settings st_accel_sensors_settings[] = { + .drdy_irq = { + .addr = 0x21, + .mask_int1 = 0x04, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .sim = { + .addr = 0x21, +@@ -451,7 +463,10 @@ static const struct st_sensor_settings st_accel_sensors_settings[] = { + .mask_ihl = 0x80, + .addr_od = 0x22, + .mask_od = 0x40, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .sim = { + .addr = 0x21, +@@ -569,7 +584,10 @@ static const struct st_sensor_settings st_accel_sensors_settings[] = { + .drdy_irq = { + .addr = 0x21, + .mask_int1 = 0x04, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .sim = { + .addr = 0x21, +@@ -640,7 +658,10 @@ static const struct st_sensor_settings st_accel_sensors_settings[] = { + .mask_int2 = 0x00, + .addr_ihl = 0x25, + .mask_ihl = 0x02, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .sim = { + .addr = 0x23, +diff --git a/drivers/iio/common/st_sensors/st_sensors_core.c b/drivers/iio/common/st_sensors/st_sensors_core.c +index 02e833b14db0..34115f05d5c4 100644 +--- a/drivers/iio/common/st_sensors/st_sensors_core.c ++++ b/drivers/iio/common/st_sensors/st_sensors_core.c +@@ -470,7 +470,7 @@ int st_sensors_set_dataready_irq(struct iio_dev *indio_dev, bool enable) + * different one. 
Take into account irq status register + * to understand if irq trigger can be properly supported + */ +- if (sdata->sensor_settings->drdy_irq.addr_stat_drdy) ++ if (sdata->sensor_settings->drdy_irq.stat_drdy.addr) + sdata->hw_irq_trigger = enable; + return 0; + } +diff --git a/drivers/iio/common/st_sensors/st_sensors_trigger.c b/drivers/iio/common/st_sensors/st_sensors_trigger.c +index fa73e6795359..fdcc5a891958 100644 +--- a/drivers/iio/common/st_sensors/st_sensors_trigger.c ++++ b/drivers/iio/common/st_sensors/st_sensors_trigger.c +@@ -31,7 +31,7 @@ static int st_sensors_new_samples_available(struct iio_dev *indio_dev, + int ret; + + /* How would I know if I can't check it? */ +- if (!sdata->sensor_settings->drdy_irq.addr_stat_drdy) ++ if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr) + return -EINVAL; + + /* No scan mask, no interrupt */ +@@ -39,23 +39,15 @@ static int st_sensors_new_samples_available(struct iio_dev *indio_dev, + return 0; + + ret = sdata->tf->read_byte(&sdata->tb, sdata->dev, +- sdata->sensor_settings->drdy_irq.addr_stat_drdy, ++ sdata->sensor_settings->drdy_irq.stat_drdy.addr, + &status); + if (ret < 0) { + dev_err(sdata->dev, + "error checking samples available\n"); + return ret; + } +- /* +- * the lower bits of .active_scan_mask[0] is directly mapped +- * to the channels on the sensor: either bit 0 for +- * one-dimensional sensors, or e.g. x,y,z for accelerometers, +- * gyroscopes or magnetometers. No sensor use more than 3 +- * channels, so cut the other status bits here. +- */ +- status &= 0x07; + +- if (status & (u8)indio_dev->active_scan_mask[0]) ++ if (status & sdata->sensor_settings->drdy_irq.stat_drdy.mask) + return 1; + + return 0; +@@ -212,7 +204,7 @@ int st_sensors_allocate_trigger(struct iio_dev *indio_dev, + * it was "our" interrupt. + */ + if (sdata->int_pin_open_drain && +- sdata->sensor_settings->drdy_irq.addr_stat_drdy) ++ sdata->sensor_settings->drdy_irq.stat_drdy.addr) + irq_trig |= IRQF_SHARED; + + err = request_threaded_irq(sdata->get_irq_data_ready(indio_dev), +diff --git a/drivers/iio/gyro/st_gyro_core.c b/drivers/iio/gyro/st_gyro_core.c +index e366422e8512..2536a8400c98 100644 +--- a/drivers/iio/gyro/st_gyro_core.c ++++ b/drivers/iio/gyro/st_gyro_core.c +@@ -118,7 +118,10 @@ static const struct st_sensor_settings st_gyro_sensors_settings[] = { + * drain settings, but only for INT1 and not + * for the DRDY line on INT2. + */ +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .multi_read_bit = true, + .bootime = 2, +@@ -188,7 +191,10 @@ static const struct st_sensor_settings st_gyro_sensors_settings[] = { + * drain settings, but only for INT1 and not + * for the DRDY line on INT2. + */ +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .multi_read_bit = true, + .bootime = 2, +@@ -253,7 +259,10 @@ static const struct st_sensor_settings st_gyro_sensors_settings[] = { + * drain settings, but only for INT1 and not + * for the DRDY line on INT2. 
+ */ +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .multi_read_bit = true, + .bootime = 2, +diff --git a/drivers/iio/magnetometer/st_magn_core.c b/drivers/iio/magnetometer/st_magn_core.c +index 08aafba4481c..19031a7bce23 100644 +--- a/drivers/iio/magnetometer/st_magn_core.c ++++ b/drivers/iio/magnetometer/st_magn_core.c +@@ -317,7 +317,10 @@ static const struct st_sensor_settings st_magn_sensors_settings[] = { + }, + .drdy_irq = { + /* drdy line is routed drdy pin */ +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x07, ++ }, + }, + .multi_read_bit = true, + .bootime = 2, +@@ -361,7 +364,10 @@ static const struct st_sensor_settings st_magn_sensors_settings[] = { + .drdy_irq = { + .addr = 0x62, + .mask_int1 = 0x01, +- .addr_stat_drdy = 0x67, ++ .stat_drdy = { ++ .addr = 0x67, ++ .mask = 0x07, ++ }, + }, + .multi_read_bit = false, + .bootime = 2, +diff --git a/drivers/iio/pressure/st_pressure_core.c b/drivers/iio/pressure/st_pressure_core.c +index 34611a8ea2ce..ea075fcd5a6f 100644 +--- a/drivers/iio/pressure/st_pressure_core.c ++++ b/drivers/iio/pressure/st_pressure_core.c +@@ -287,7 +287,10 @@ static const struct st_sensor_settings st_press_sensors_settings[] = { + .mask_ihl = 0x80, + .addr_od = 0x22, + .mask_od = 0x40, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x03, ++ }, + }, + .multi_read_bit = true, + .bootime = 2, +@@ -395,7 +398,10 @@ static const struct st_sensor_settings st_press_sensors_settings[] = { + .mask_ihl = 0x80, + .addr_od = 0x22, + .mask_od = 0x40, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x03, ++ }, + }, + .multi_read_bit = true, + .bootime = 2, +@@ -454,7 +460,10 @@ static const struct st_sensor_settings st_press_sensors_settings[] = { + .mask_ihl = 0x80, + .addr_od = 0x12, + .mask_od = 0x40, +- .addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .stat_drdy = { ++ .addr = ST_SENSORS_DEFAULT_STAT_ADDR, ++ .mask = 0x03, ++ }, + }, + .multi_read_bit = false, + .bootime = 2, +diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c +index 747efd1ae5a6..8208c30f03c5 100644 +--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c ++++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c +@@ -1001,6 +1001,11 @@ static void hns_roce_v1_mr_free_work_fn(struct work_struct *work) + } + } + ++ if (!ne) { ++ dev_err(dev, "Reseved loop qp is absent!\n"); ++ goto free_work; ++ } ++ + do { + ret = hns_roce_v1_poll_cq(&mr_free_cq->ib_cq, ne, wc); + if (ret < 0) { +diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c +index c1b5f38f31a5..3b4916680018 100644 +--- a/drivers/infiniband/sw/rxe/rxe_pool.c ++++ b/drivers/infiniband/sw/rxe/rxe_pool.c +@@ -404,6 +404,8 @@ void *rxe_alloc(struct rxe_pool *pool) + elem = kmem_cache_zalloc(pool_cache(pool), + (pool->flags & RXE_POOL_ATOMIC) ? 
+ GFP_ATOMIC : GFP_KERNEL); ++ if (!elem) ++ return NULL; + + elem->pool = pool; + kref_init(&elem->ref_cnt); +diff --git a/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c b/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c +index afa938bd26d6..a72278e9cd27 100644 +--- a/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c ++++ b/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c +@@ -139,6 +139,7 @@ void opa_vnic_release_mac_tbl(struct opa_vnic_adapter *adapter) + rcu_assign_pointer(adapter->mactbl, NULL); + synchronize_rcu(); + opa_vnic_free_mac_tbl(mactbl); ++ adapter->info.vport.mac_tbl_digest = 0; + mutex_unlock(&adapter->mactbl_lock); + } + +diff --git a/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c b/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c +index c2733964379c..9655cc3aa3a0 100644 +--- a/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c ++++ b/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c +@@ -348,7 +348,7 @@ void opa_vnic_query_mcast_macs(struct opa_vnic_adapter *adapter, + void opa_vnic_query_ucast_macs(struct opa_vnic_adapter *adapter, + struct opa_veswport_iface_macs *macs) + { +- u16 start_idx, tot_macs, num_macs, idx = 0, count = 0; ++ u16 start_idx, tot_macs, num_macs, idx = 0, count = 0, em_macs = 0; + struct netdev_hw_addr *ha; + + start_idx = be16_to_cpu(macs->start_idx); +@@ -359,8 +359,10 @@ void opa_vnic_query_ucast_macs(struct opa_vnic_adapter *adapter, + + /* Do not include EM specified MAC address */ + if (!memcmp(adapter->info.vport.base_mac_addr, ha->addr, +- ARRAY_SIZE(adapter->info.vport.base_mac_addr))) ++ ARRAY_SIZE(adapter->info.vport.base_mac_addr))) { ++ em_macs++; + continue; ++ } + + if (start_idx > idx++) + continue; +@@ -383,7 +385,7 @@ void opa_vnic_query_ucast_macs(struct opa_vnic_adapter *adapter, + } + + tot_macs = netdev_hw_addr_list_count(&adapter->netdev->dev_addrs) + +- netdev_uc_count(adapter->netdev); ++ netdev_uc_count(adapter->netdev) - em_macs; + macs->tot_macs_in_lst = cpu_to_be16(tot_macs); + macs->num_macs_in_msg = cpu_to_be16(count); + macs->gen_count = cpu_to_be16(adapter->info.vport.uc_macs_gen_count); +diff --git a/drivers/input/mouse/vmmouse.c b/drivers/input/mouse/vmmouse.c +index 0f586780ceb4..1ae5c1ef3f5b 100644 +--- a/drivers/input/mouse/vmmouse.c ++++ b/drivers/input/mouse/vmmouse.c +@@ -316,11 +316,9 @@ static int vmmouse_enable(struct psmouse *psmouse) + /* + * Array of supported hypervisors. 
+ */ +-static const struct hypervisor_x86 *vmmouse_supported_hypervisors[] = { +- &x86_hyper_vmware, +-#ifdef CONFIG_KVM_GUEST +- &x86_hyper_kvm, +-#endif ++static enum x86_hypervisor_type vmmouse_supported_hypervisors[] = { ++ X86_HYPER_VMWARE, ++ X86_HYPER_KVM, + }; + + /** +@@ -331,7 +329,7 @@ static bool vmmouse_check_hypervisor(void) + int i; + + for (i = 0; i < ARRAY_SIZE(vmmouse_supported_hypervisors); i++) +- if (vmmouse_supported_hypervisors[i] == x86_hyper) ++ if (vmmouse_supported_hypervisors[i] == x86_hyper_type) + return true; + + return false; +diff --git a/drivers/leds/leds-pca955x.c b/drivers/leds/leds-pca955x.c +index 905729191d3e..78183f90820e 100644 +--- a/drivers/leds/leds-pca955x.c ++++ b/drivers/leds/leds-pca955x.c +@@ -61,6 +61,10 @@ + #define PCA955X_LS_BLINK0 0x2 /* Blink at PWM0 rate */ + #define PCA955X_LS_BLINK1 0x3 /* Blink at PWM1 rate */ + ++#define PCA955X_GPIO_INPUT LED_OFF ++#define PCA955X_GPIO_HIGH LED_OFF ++#define PCA955X_GPIO_LOW LED_FULL ++ + enum pca955x_type { + pca9550, + pca9551, +@@ -329,9 +333,9 @@ static int pca955x_set_value(struct gpio_chip *gc, unsigned int offset, + struct pca955x_led *led = &pca955x->leds[offset]; + + if (val) +- return pca955x_led_set(&led->led_cdev, LED_FULL); +- else +- return pca955x_led_set(&led->led_cdev, LED_OFF); ++ return pca955x_led_set(&led->led_cdev, PCA955X_GPIO_HIGH); ++ ++ return pca955x_led_set(&led->led_cdev, PCA955X_GPIO_LOW); + } + + static void pca955x_gpio_set_value(struct gpio_chip *gc, unsigned int offset, +@@ -355,8 +359,11 @@ static int pca955x_gpio_get_value(struct gpio_chip *gc, unsigned int offset) + static int pca955x_gpio_direction_input(struct gpio_chip *gc, + unsigned int offset) + { +- /* To use as input ensure pin is not driven */ +- return pca955x_set_value(gc, offset, 0); ++ struct pca955x *pca955x = gpiochip_get_data(gc); ++ struct pca955x_led *led = &pca955x->leds[offset]; ++ ++ /* To use as input ensure pin is not driven. */ ++ return pca955x_led_set(&led->led_cdev, PCA955X_GPIO_INPUT); + } + + static int pca955x_gpio_direction_output(struct gpio_chip *gc, +diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c +index 35e82b14ded7..ddf0a4341ae2 100644 +--- a/drivers/md/dm-mpath.c ++++ b/drivers/md/dm-mpath.c +@@ -366,7 +366,7 @@ static struct pgpath *choose_path_in_pg(struct multipath *m, + + pgpath = path_to_pgpath(path); + +- if (unlikely(lockless_dereference(m->current_pg) != pg)) { ++ if (unlikely(READ_ONCE(m->current_pg) != pg)) { + /* Only update current_pgpath if pg changed */ + spin_lock_irqsave(&m->lock, flags); + m->current_pgpath = pgpath; +@@ -390,7 +390,7 @@ static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes) + } + + /* Were we instructed to switch PG? */ +- if (lockless_dereference(m->next_pg)) { ++ if (READ_ONCE(m->next_pg)) { + spin_lock_irqsave(&m->lock, flags); + pg = m->next_pg; + if (!pg) { +@@ -406,7 +406,7 @@ static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes) + + /* Don't change PG until it has no remaining paths */ + check_current_pg: +- pg = lockless_dereference(m->current_pg); ++ pg = READ_ONCE(m->current_pg); + if (pg) { + pgpath = choose_path_in_pg(m, pg, nr_bytes); + if (!IS_ERR_OR_NULL(pgpath)) +@@ -473,7 +473,7 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, + struct request *clone; + + /* Do we need to select a new pgpath? 
*/ +- pgpath = lockless_dereference(m->current_pgpath); ++ pgpath = READ_ONCE(m->current_pgpath); + if (!pgpath || !test_bit(MPATHF_QUEUE_IO, &m->flags)) + pgpath = choose_pgpath(m, nr_bytes); + +@@ -533,7 +533,7 @@ static int __multipath_map_bio(struct multipath *m, struct bio *bio, struct dm_m + bool queue_io; + + /* Do we need to select a new pgpath? */ +- pgpath = lockless_dereference(m->current_pgpath); ++ pgpath = READ_ONCE(m->current_pgpath); + queue_io = test_bit(MPATHF_QUEUE_IO, &m->flags); + if (!pgpath || !queue_io) + pgpath = choose_pgpath(m, nr_bytes); +@@ -1802,7 +1802,7 @@ static int multipath_prepare_ioctl(struct dm_target *ti, + struct pgpath *current_pgpath; + int r; + +- current_pgpath = lockless_dereference(m->current_pgpath); ++ current_pgpath = READ_ONCE(m->current_pgpath); + if (!current_pgpath) + current_pgpath = choose_pgpath(m, 0); + +@@ -1824,7 +1824,7 @@ static int multipath_prepare_ioctl(struct dm_target *ti, + } + + if (r == -ENOTCONN) { +- if (!lockless_dereference(m->current_pg)) { ++ if (!READ_ONCE(m->current_pg)) { + /* Path status changed, redo selection */ + (void) choose_pgpath(m, 0); + } +@@ -1893,9 +1893,9 @@ static int multipath_busy(struct dm_target *ti) + return (m->queue_mode != DM_TYPE_MQ_REQUEST_BASED); + + /* Guess which priority_group will be used at next mapping time */ +- pg = lockless_dereference(m->current_pg); +- next_pg = lockless_dereference(m->next_pg); +- if (unlikely(!lockless_dereference(m->current_pgpath) && next_pg)) ++ pg = READ_ONCE(m->current_pg); ++ next_pg = READ_ONCE(m->next_pg); ++ if (unlikely(!READ_ONCE(m->current_pgpath) && next_pg)) + pg = next_pg; + + if (!pg) { +diff --git a/drivers/md/md.c b/drivers/md/md.c +index 98ea86309ceb..6bf093cef958 100644 +--- a/drivers/md/md.c ++++ b/drivers/md/md.c +@@ -7468,8 +7468,8 @@ void md_wakeup_thread(struct md_thread *thread) + { + if (thread) { + pr_debug("md: waking up MD thread %s.\n", thread->tsk->comm); +- if (!test_and_set_bit(THREAD_WAKEUP, &thread->flags)) +- wake_up(&thread->wqueue); ++ set_bit(THREAD_WAKEUP, &thread->flags); ++ wake_up(&thread->wqueue); + } + } + EXPORT_SYMBOL(md_wakeup_thread); +diff --git a/drivers/misc/pti.c b/drivers/misc/pti.c +index eda38cbe8530..41f2a9f6851d 100644 +--- a/drivers/misc/pti.c ++++ b/drivers/misc/pti.c +@@ -32,7 +32,7 @@ + #include + #include + #include +-#include ++#include + #include + #include + +diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c +index 1e688bfec567..9047c0a529b2 100644 +--- a/drivers/misc/vmw_balloon.c ++++ b/drivers/misc/vmw_balloon.c +@@ -1271,7 +1271,7 @@ static int __init vmballoon_init(void) + * Check if we are running on VMware's hypervisor and bail out + * if we are not. 
+ */ +- if (x86_hyper != &x86_hyper_vmware) ++ if (x86_hyper_type != X86_HYPER_VMWARE) + return -ENODEV; + + for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES; +diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c +index c66abd476023..3b0db01ead1f 100644 +--- a/drivers/net/ethernet/ibm/ibmvnic.c ++++ b/drivers/net/ethernet/ibm/ibmvnic.c +@@ -927,6 +927,7 @@ static int ibmvnic_open(struct net_device *netdev) + } + + rc = __ibmvnic_open(netdev); ++ netif_carrier_on(netdev); + mutex_unlock(&adapter->reset_lock); + + return rc; +@@ -3899,6 +3900,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) + if (rc) + goto ibmvnic_init_fail; + ++ netif_carrier_off(netdev); + rc = register_netdev(netdev); + if (rc) { + dev_err(&dev->dev, "failed to register netdev rc=%d\n", rc); +diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h +index 689c413b7782..d2f9a2dd76a2 100644 +--- a/drivers/net/ethernet/intel/fm10k/fm10k.h ++++ b/drivers/net/ethernet/intel/fm10k/fm10k.h +@@ -526,8 +526,8 @@ s32 fm10k_iov_update_pvid(struct fm10k_intfc *interface, u16 glort, u16 pvid); + int fm10k_ndo_set_vf_mac(struct net_device *netdev, int vf_idx, u8 *mac); + int fm10k_ndo_set_vf_vlan(struct net_device *netdev, + int vf_idx, u16 vid, u8 qos, __be16 vlan_proto); +-int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, int rate, +- int unused); ++int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, ++ int __always_unused min_rate, int max_rate); + int fm10k_ndo_get_vf_config(struct net_device *netdev, + int vf_idx, struct ifla_vf_info *ivi); + +diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c +index 5f4dac0d36ef..e72fd52bacfe 100644 +--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c ++++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c +@@ -126,6 +126,9 @@ s32 fm10k_iov_mbx(struct fm10k_intfc *interface) + struct fm10k_mbx_info *mbx = &vf_info->mbx; + u16 glort = vf_info->glort; + ++ /* process the SM mailbox first to drain outgoing messages */ ++ hw->mbx.ops.process(hw, &hw->mbx); ++ + /* verify port mapping is valid, if not reset port */ + if (vf_info->vf_flags && !fm10k_glort_valid_pf(hw, glort)) + hw->iov.ops.reset_lport(hw, vf_info); +@@ -482,7 +485,7 @@ int fm10k_ndo_set_vf_vlan(struct net_device *netdev, int vf_idx, u16 vid, + } + + int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, +- int __always_unused unused, int rate) ++ int __always_unused min_rate, int max_rate) + { + struct fm10k_intfc *interface = netdev_priv(netdev); + struct fm10k_iov_data *iov_data = interface->iov_data; +@@ -493,14 +496,15 @@ int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, + return -EINVAL; + + /* rate limit cannot be less than 10Mbs or greater than link speed */ +- if (rate && ((rate < FM10K_VF_TC_MIN) || rate > FM10K_VF_TC_MAX)) ++ if (max_rate && ++ (max_rate < FM10K_VF_TC_MIN || max_rate > FM10K_VF_TC_MAX)) + return -EINVAL; + + /* store values */ +- iov_data->vf_info[vf_idx].rate = rate; ++ iov_data->vf_info[vf_idx].rate = max_rate; + + /* update hardware configuration */ +- hw->iov.ops.configure_tc(hw, vf_idx, rate); ++ hw->iov.ops.configure_tc(hw, vf_idx, max_rate); + + return 0; + } +diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c +index ea20aacd5e1d..b2cde9b16d82 100644 +--- a/drivers/net/ethernet/intel/i40e/i40e_main.c ++++ 
b/drivers/net/ethernet/intel/i40e/i40e_main.c +@@ -2874,14 +2874,15 @@ static void i40e_vsi_free_rx_resources(struct i40e_vsi *vsi) + static void i40e_config_xps_tx_ring(struct i40e_ring *ring) + { + struct i40e_vsi *vsi = ring->vsi; ++ int cpu; + + if (!ring->q_vector || !ring->netdev) + return; + + if ((vsi->tc_config.numtc <= 1) && + !test_and_set_bit(__I40E_TX_XPS_INIT_DONE, &ring->state)) { +- netif_set_xps_queue(ring->netdev, +- get_cpu_mask(ring->q_vector->v_idx), ++ cpu = cpumask_local_spread(ring->q_vector->v_idx, -1); ++ netif_set_xps_queue(ring->netdev, get_cpu_mask(cpu), + ring->queue_index); + } + +@@ -3471,6 +3472,7 @@ static int i40e_vsi_request_irq_msix(struct i40e_vsi *vsi, char *basename) + int tx_int_idx = 0; + int vector, err; + int irq_num; ++ int cpu; + + for (vector = 0; vector < q_vectors; vector++) { + struct i40e_q_vector *q_vector = vsi->q_vectors[vector]; +@@ -3506,10 +3508,14 @@ static int i40e_vsi_request_irq_msix(struct i40e_vsi *vsi, char *basename) + q_vector->affinity_notify.notify = i40e_irq_affinity_notify; + q_vector->affinity_notify.release = i40e_irq_affinity_release; + irq_set_affinity_notifier(irq_num, &q_vector->affinity_notify); +- /* get_cpu_mask returns a static constant mask with +- * a permanent lifetime so it's ok to use here. ++ /* Spread affinity hints out across online CPUs. ++ * ++ * get_cpu_mask returns a static constant mask with ++ * a permanent lifetime so it's ok to pass to ++ * irq_set_affinity_hint without making a copy. + */ +- irq_set_affinity_hint(irq_num, get_cpu_mask(q_vector->v_idx)); ++ cpu = cpumask_local_spread(q_vector->v_idx, -1); ++ irq_set_affinity_hint(irq_num, get_cpu_mask(cpu)); + } + + vsi->irqs_ready = true; +diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +index 4d1e670f490e..e368b0237a1b 100644 +--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c ++++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +@@ -1008,8 +1008,8 @@ static void i40e_cleanup_reset_vf(struct i40e_vf *vf) + set_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states); + clear_bit(I40E_VF_STATE_DISABLED, &vf->vf_states); + /* Do not notify the client during VF init */ +- if (test_and_clear_bit(I40E_VF_STATE_PRE_ENABLE, +- &vf->vf_states)) ++ if (!test_and_clear_bit(I40E_VF_STATE_PRE_ENABLE, ++ &vf->vf_states)) + i40e_notify_client_of_vf_reset(pf, abs_vf_id); + vf->num_vlan = 0; + } +@@ -2779,6 +2779,7 @@ int i40e_ndo_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac) + struct i40e_mac_filter *f; + struct i40e_vf *vf; + int ret = 0; ++ struct hlist_node *h; + int bkt; + + /* validate the request */ +@@ -2817,7 +2818,7 @@ int i40e_ndo_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac) + /* Delete all the filters for this VSI - we're going to kill it + * anyway. 
+ */ +- hash_for_each(vsi->mac_filter_hash, bkt, f, hlist) ++ hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) + __i40e_del_filter(vsi, f); + + spin_unlock_bh(&vsi->mac_filter_hash_lock); +diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c +index 1825d956bb00..1ccad6f30ebf 100644 +--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c ++++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c +@@ -546,6 +546,7 @@ i40evf_request_traffic_irqs(struct i40evf_adapter *adapter, char *basename) + unsigned int vector, q_vectors; + unsigned int rx_int_idx = 0, tx_int_idx = 0; + int irq_num, err; ++ int cpu; + + i40evf_irq_disable(adapter); + /* Decrement for Other and TCP Timer vectors */ +@@ -584,10 +585,12 @@ i40evf_request_traffic_irqs(struct i40evf_adapter *adapter, char *basename) + q_vector->affinity_notify.release = + i40evf_irq_affinity_release; + irq_set_affinity_notifier(irq_num, &q_vector->affinity_notify); +- /* get_cpu_mask returns a static constant mask with +- * a permanent lifetime so it's ok to use here. ++ /* Spread the IRQ affinity hints across online CPUs. Note that ++ * get_cpu_mask returns a mask with a permanent lifetime so ++ * it's safe to use as a hint for irq_set_affinity_hint. + */ +- irq_set_affinity_hint(irq_num, get_cpu_mask(q_vector->v_idx)); ++ cpu = cpumask_local_spread(q_vector->v_idx, -1); ++ irq_set_affinity_hint(irq_num, get_cpu_mask(cpu)); + } + + return 0; +diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c +index b0031c5ff767..667dbc7d4a4e 100644 +--- a/drivers/net/ethernet/intel/igb/igb_main.c ++++ b/drivers/net/ethernet/intel/igb/igb_main.c +@@ -3162,6 +3162,8 @@ static int igb_sw_init(struct igb_adapter *adapter) + /* Setup and initialize a copy of the hw vlan table array */ + adapter->shadow_vfta = kcalloc(E1000_VLAN_FILTER_TBL_SIZE, sizeof(u32), + GFP_ATOMIC); ++ if (!adapter->shadow_vfta) ++ return -ENOMEM; + + /* This call may decrease the number of queues */ + if (igb_init_interrupt_scheme(adapter, true)) { +diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c +index 6e6ab6f6875e..64429a14c630 100644 +--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c ++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c +@@ -3781,10 +3781,10 @@ s32 ixgbe_set_fw_drv_ver_generic(struct ixgbe_hw *hw, u8 maj, u8 min, + fw_cmd.ver_build = build; + fw_cmd.ver_sub = sub; + fw_cmd.hdr.checksum = 0; +- fw_cmd.hdr.checksum = ixgbe_calculate_checksum((u8 *)&fw_cmd, +- (FW_CEM_HDR_LEN + fw_cmd.hdr.buf_len)); + fw_cmd.pad = 0; + fw_cmd.pad2 = 0; ++ fw_cmd.hdr.checksum = ixgbe_calculate_checksum((u8 *)&fw_cmd, ++ (FW_CEM_HDR_LEN + fw_cmd.hdr.buf_len)); + + for (i = 0; i <= FW_CEM_MAX_RETRIES; i++) { + ret_val = ixgbe_host_interface_command(hw, &fw_cmd, +diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c +index 19fbb2f28ea4..8a85217845ae 100644 +--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c ++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c +@@ -900,6 +900,8 @@ static s32 ixgbe_read_ee_hostif_buffer_X550(struct ixgbe_hw *hw, + /* convert offset from words to bytes */ + buffer.address = cpu_to_be32((offset + current_word) * 2); + buffer.length = cpu_to_be16(words_to_read * 2); ++ buffer.pad2 = 0; ++ buffer.pad3 = 0; + + status = ixgbe_hic_unlocked(hw, (u32 *)&buffer, sizeof(buffer), + IXGBE_HI_COMMAND_TIMEOUT); +diff --git 
a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c +index c1e52b9dc58d..5f93e6add563 100644 +--- a/drivers/net/phy/at803x.c ++++ b/drivers/net/phy/at803x.c +@@ -167,7 +167,7 @@ static int at803x_set_wol(struct phy_device *phydev, + mac = (const u8 *) ndev->dev_addr; + + if (!is_valid_ether_addr(mac)) +- return -EFAULT; ++ return -EINVAL; + + for (i = 0; i < 3; i++) { + phy_write(phydev, AT803X_MMD_ACCESS_CONTROL, +diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c +index ac41c8be9200..0fd8e164339c 100644 +--- a/drivers/pci/iov.c ++++ b/drivers/pci/iov.c +@@ -162,7 +162,6 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset) + + pci_device_add(virtfn, virtfn->bus); + +- pci_bus_add_device(virtfn); + sprintf(buf, "virtfn%u", id); + rc = sysfs_create_link(&dev->dev.kobj, &virtfn->dev.kobj, buf); + if (rc) +@@ -173,6 +172,8 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset) + + kobject_uevent(&virtfn->dev.kobj, KOBJ_CHANGE); + ++ pci_bus_add_device(virtfn); ++ + return 0; + + failed2: +diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c +index 6078dfc11b11..74f1c57ab93b 100644 +--- a/drivers/pci/pci.c ++++ b/drivers/pci/pci.c +@@ -4356,6 +4356,10 @@ static bool pci_bus_resetable(struct pci_bus *bus) + { + struct pci_dev *dev; + ++ ++ if (bus->self && (bus->self->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)) ++ return false; ++ + list_for_each_entry(dev, &bus->devices, bus_list) { + if (dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET || + (dev->subordinate && !pci_bus_resetable(dev->subordinate))) +diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c +index 890efcc574cb..744805232155 100644 +--- a/drivers/pci/pcie/aer/aerdrv_core.c ++++ b/drivers/pci/pcie/aer/aerdrv_core.c +@@ -390,7 +390,14 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev, + * If the error is reported by an end point, we think this + * error is related to the upstream link of the end point. + */ +- pci_walk_bus(dev->bus, cb, &result_data); ++ if (state == pci_channel_io_normal) ++ /* ++ * the error is non fatal so the bus is ok, just invoke ++ * the callback for the function that logged the error. 
++ */ ++ cb(dev, &result_data); ++ else ++ pci_walk_bus(dev->bus, cb, &result_data); + } + + return result_data.result; +diff --git a/drivers/platform/x86/asus-wireless.c b/drivers/platform/x86/asus-wireless.c +index f3796164329e..d4aeac3477f5 100644 +--- a/drivers/platform/x86/asus-wireless.c ++++ b/drivers/platform/x86/asus-wireless.c +@@ -118,6 +118,7 @@ static void asus_wireless_notify(struct acpi_device *adev, u32 event) + return; + } + input_report_key(data->idev, KEY_RFKILL, 1); ++ input_sync(data->idev); + input_report_key(data->idev, KEY_RFKILL, 0); + input_sync(data->idev); + } +diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c +index 8cec9a02c0b8..9eb32ead63db 100644 +--- a/drivers/rtc/interface.c ++++ b/drivers/rtc/interface.c +@@ -779,7 +779,7 @@ static int rtc_timer_enqueue(struct rtc_device *rtc, struct rtc_timer *timer) + } + + timerqueue_add(&rtc->timerqueue, &timer->node); +- if (!next) { ++ if (!next || ktime_before(timer->node.expires, next->expires)) { + struct rtc_wkalrm alarm; + int err; + alarm.time = rtc_ktime_to_tm(timer->node.expires); +diff --git a/drivers/rtc/rtc-pl031.c b/drivers/rtc/rtc-pl031.c +index e1687e19c59f..a30f24cb6c83 100644 +--- a/drivers/rtc/rtc-pl031.c ++++ b/drivers/rtc/rtc-pl031.c +@@ -308,7 +308,8 @@ static int pl031_remove(struct amba_device *adev) + + dev_pm_clear_wake_irq(&adev->dev); + device_init_wakeup(&adev->dev, false); +- free_irq(adev->irq[0], ldata); ++ if (adev->irq[0]) ++ free_irq(adev->irq[0], ldata); + rtc_device_unregister(ldata->rtc); + iounmap(ldata->base); + kfree(ldata); +@@ -381,12 +382,13 @@ static int pl031_probe(struct amba_device *adev, const struct amba_id *id) + goto out_no_rtc; + } + +- if (request_irq(adev->irq[0], pl031_interrupt, +- vendor->irqflags, "rtc-pl031", ldata)) { +- ret = -EIO; +- goto out_no_irq; ++ if (adev->irq[0]) { ++ ret = request_irq(adev->irq[0], pl031_interrupt, ++ vendor->irqflags, "rtc-pl031", ldata); ++ if (ret) ++ goto out_no_irq; ++ dev_pm_set_wake_irq(&adev->dev, adev->irq[0]); + } +- dev_pm_set_wake_irq(&adev->dev, adev->irq[0]); + return 0; + + out_no_irq: +diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c +index 1d02cf9fe06c..30d5f0ef29bb 100644 +--- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c ++++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c +@@ -1575,6 +1575,7 @@ static void release_offload_resources(struct cxgbi_sock *csk) + csk, csk->state, csk->flags, csk->tid); + + cxgbi_sock_free_cpl_skbs(csk); ++ cxgbi_sock_purge_write_queue(csk); + if (csk->wr_cred != csk->wr_max_cred) { + cxgbi_sock_purge_wr_queue(csk); + cxgbi_sock_reset_wr_list(csk); +diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c +index 499df9d17339..d9a03beb76a4 100644 +--- a/drivers/scsi/lpfc/lpfc_hbadisc.c ++++ b/drivers/scsi/lpfc/lpfc_hbadisc.c +@@ -4983,7 +4983,8 @@ lpfc_nlp_remove(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp) + lpfc_cancel_retry_delay_tmo(vport, ndlp); + if ((ndlp->nlp_flag & NLP_DEFER_RM) && + !(ndlp->nlp_flag & NLP_REG_LOGIN_SEND) && +- !(ndlp->nlp_flag & NLP_RPI_REGISTERED)) { ++ !(ndlp->nlp_flag & NLP_RPI_REGISTERED) && ++ phba->sli_rev != LPFC_SLI_REV4) { + /* For this case we need to cleanup the default rpi + * allocated by the firmware. 
+ */ +diff --git a/drivers/scsi/lpfc/lpfc_hw4.h b/drivers/scsi/lpfc/lpfc_hw4.h +index 1db0a38683f4..2b145966c73f 100644 +--- a/drivers/scsi/lpfc/lpfc_hw4.h ++++ b/drivers/scsi/lpfc/lpfc_hw4.h +@@ -3636,7 +3636,7 @@ struct lpfc_mbx_get_port_name { + #define MB_CEQ_STATUS_QUEUE_FLUSHING 0x4 + #define MB_CQE_STATUS_DMA_FAILED 0x5 + +-#define LPFC_MBX_WR_CONFIG_MAX_BDE 8 ++#define LPFC_MBX_WR_CONFIG_MAX_BDE 1 + struct lpfc_mbx_wr_object { + struct mbox_header header; + union { +diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c +index 3c5b054a56ac..7ac1a067d780 100644 +--- a/drivers/scsi/lpfc/lpfc_nvmet.c ++++ b/drivers/scsi/lpfc/lpfc_nvmet.c +@@ -1464,6 +1464,7 @@ static struct lpfc_nvmet_ctxbuf * + lpfc_nvmet_replenish_context(struct lpfc_hba *phba, + struct lpfc_nvmet_ctx_info *current_infop) + { ++#if (IS_ENABLED(CONFIG_NVME_TARGET_FC)) + struct lpfc_nvmet_ctxbuf *ctx_buf = NULL; + struct lpfc_nvmet_ctx_info *get_infop; + int i; +@@ -1511,6 +1512,7 @@ lpfc_nvmet_replenish_context(struct lpfc_hba *phba, + get_infop = get_infop->nvmet_ctx_next_cpu; + } + ++#endif + /* Nothing found, all contexts for the MRQ are in-flight */ + return NULL; + } +diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c +index 22998cbd538f..33ff691878e2 100644 +--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c ++++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c +@@ -4804,6 +4804,11 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply) + } else if (log_info == VIRTUAL_IO_FAILED_RETRY) { + scmd->result = DID_RESET << 16; + break; ++ } else if ((scmd->device->channel == RAID_CHANNEL) && ++ (scsi_state == (MPI2_SCSI_STATE_TERMINATED | ++ MPI2_SCSI_STATE_NO_SCSI_STATUS))) { ++ scmd->result = DID_RESET << 16; ++ break; + } + scmd->result = DID_SOFT_ERROR << 16; + break; +diff --git a/drivers/staging/greybus/light.c b/drivers/staging/greybus/light.c +index 3f4148c92308..0f538b8c3a07 100644 +--- a/drivers/staging/greybus/light.c ++++ b/drivers/staging/greybus/light.c +@@ -925,6 +925,8 @@ static void __gb_lights_led_unregister(struct gb_channel *channel) + return; + + led_classdev_unregister(cdev); ++ kfree(cdev->name); ++ cdev->name = NULL; + channel->led = NULL; + } + +diff --git a/drivers/tee/optee/core.c b/drivers/tee/optee/core.c +index 7952357df9c8..edb6e4e9ef3a 100644 +--- a/drivers/tee/optee/core.c ++++ b/drivers/tee/optee/core.c +@@ -590,7 +590,6 @@ static int __init optee_driver_init(void) + return -ENODEV; + + np = of_find_matching_node(fw_np, optee_match); +- of_node_put(fw_np); + if (!np) + return -ENODEV; + +diff --git a/drivers/thermal/hisi_thermal.c b/drivers/thermal/hisi_thermal.c +index bd3572c41585..6d8906d65476 100644 +--- a/drivers/thermal/hisi_thermal.c ++++ b/drivers/thermal/hisi_thermal.c +@@ -35,8 +35,9 @@ + #define TEMP0_RST_MSK (0x1C) + #define TEMP0_VALUE (0x28) + +-#define HISI_TEMP_BASE (-60) ++#define HISI_TEMP_BASE (-60000) + #define HISI_TEMP_RESET (100000) ++#define HISI_TEMP_STEP (784) + + #define HISI_MAX_SENSORS 4 + +@@ -61,19 +62,38 @@ struct hisi_thermal_data { + void __iomem *regs; + }; + +-/* in millicelsius */ +-static inline int _step_to_temp(int step) ++/* ++ * The temperature computation on the tsensor is as follow: ++ * Unit: millidegree Celsius ++ * Step: 255/200 (0.7843) ++ * Temperature base: -60°C ++ * ++ * The register is programmed in temperature steps, every step is 784 ++ * millidegree and begins at -60 000 m°C ++ * ++ * The temperature from the steps: ++ * ++ * Temp = TempBase + (steps x 784) ++ 
* ++ * and the steps from the temperature: ++ * ++ * steps = (Temp - TempBase) / 784 ++ * ++ */ ++static inline int hisi_thermal_step_to_temp(int step) + { +- /* +- * Every step equals (1 * 200) / 255 celsius, and finally +- * need convert to millicelsius. +- */ +- return (HISI_TEMP_BASE * 1000 + (step * 200000 / 255)); ++ return HISI_TEMP_BASE + (step * HISI_TEMP_STEP); ++} ++ ++static inline long hisi_thermal_temp_to_step(long temp) ++{ ++ return (temp - HISI_TEMP_BASE) / HISI_TEMP_STEP; + } + +-static inline long _temp_to_step(long temp) ++static inline long hisi_thermal_round_temp(int temp) + { +- return ((temp - HISI_TEMP_BASE * 1000) * 255) / 200000; ++ return hisi_thermal_step_to_temp( ++ hisi_thermal_temp_to_step(temp)); + } + + static long hisi_thermal_get_sensor_temp(struct hisi_thermal_data *data, +@@ -99,7 +119,7 @@ static long hisi_thermal_get_sensor_temp(struct hisi_thermal_data *data, + usleep_range(3000, 5000); + + val = readl(data->regs + TEMP0_VALUE); +- val = _step_to_temp(val); ++ val = hisi_thermal_step_to_temp(val); + + mutex_unlock(&data->thermal_lock); + +@@ -126,10 +146,11 @@ static void hisi_thermal_enable_bind_irq_sensor + writel((sensor->id << 12), data->regs + TEMP0_CFG); + + /* enable for interrupt */ +- writel(_temp_to_step(sensor->thres_temp) | 0x0FFFFFF00, ++ writel(hisi_thermal_temp_to_step(sensor->thres_temp) | 0x0FFFFFF00, + data->regs + TEMP0_TH); + +- writel(_temp_to_step(HISI_TEMP_RESET), data->regs + TEMP0_RST_TH); ++ writel(hisi_thermal_temp_to_step(HISI_TEMP_RESET), ++ data->regs + TEMP0_RST_TH); + + /* enable module */ + writel(0x1, data->regs + TEMP0_RST_MSK); +@@ -230,7 +251,7 @@ static irqreturn_t hisi_thermal_alarm_irq_thread(int irq, void *dev) + sensor = &data->sensors[data->irq_bind_sensor]; + + dev_crit(&data->pdev->dev, "THERMAL ALARM: T > %d\n", +- sensor->thres_temp / 1000); ++ sensor->thres_temp); + mutex_unlock(&data->thermal_lock); + + for (i = 0; i < HISI_MAX_SENSORS; i++) { +@@ -269,7 +290,7 @@ static int hisi_thermal_register_sensor(struct platform_device *pdev, + + for (i = 0; i < of_thermal_get_ntrips(sensor->tzd); i++) { + if (trip[i].type == THERMAL_TRIP_PASSIVE) { +- sensor->thres_temp = trip[i].temperature; ++ sensor->thres_temp = hisi_thermal_round_temp(trip[i].temperature); + break; + } + } +@@ -317,15 +338,6 @@ static int hisi_thermal_probe(struct platform_device *pdev) + if (data->irq < 0) + return data->irq; + +- ret = devm_request_threaded_irq(&pdev->dev, data->irq, +- hisi_thermal_alarm_irq, +- hisi_thermal_alarm_irq_thread, +- 0, "hisi_thermal", data); +- if (ret < 0) { +- dev_err(&pdev->dev, "failed to request alarm irq: %d\n", ret); +- return ret; +- } +- + platform_set_drvdata(pdev, data); + + data->clk = devm_clk_get(&pdev->dev, "thermal_clk"); +@@ -345,8 +357,7 @@ static int hisi_thermal_probe(struct platform_device *pdev) + } + + hisi_thermal_enable_bind_irq_sensor(data); +- irq_get_irqchip_state(data->irq, IRQCHIP_STATE_MASKED, +- &data->irq_enabled); ++ data->irq_enabled = true; + + for (i = 0; i < HISI_MAX_SENSORS; ++i) { + ret = hisi_thermal_register_sensor(pdev, data, +@@ -358,6 +369,17 @@ static int hisi_thermal_probe(struct platform_device *pdev) + hisi_thermal_toggle_sensor(&data->sensors[i], true); + } + ++ ret = devm_request_threaded_irq(&pdev->dev, data->irq, ++ hisi_thermal_alarm_irq, ++ hisi_thermal_alarm_irq_thread, ++ 0, "hisi_thermal", data); ++ if (ret < 0) { ++ dev_err(&pdev->dev, "failed to request alarm irq: %d\n", ret); ++ return ret; ++ } ++ ++ enable_irq(data->irq); ++ + return 0; + } + 
+diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c +index 5628fe114347..91335e6de88a 100644 +--- a/drivers/vfio/pci/vfio_pci_config.c ++++ b/drivers/vfio/pci/vfio_pci_config.c +@@ -849,11 +849,13 @@ static int __init init_pci_cap_exp_perm(struct perm_bits *perm) + + /* + * Allow writes to device control fields, except devctl_phantom, +- * which could confuse IOMMU, and the ARI bit in devctl2, which ++ * which could confuse IOMMU, MPS, which can break communication ++ * with other physical devices, and the ARI bit in devctl2, which + * is set at probe time. FLR gets virtualized via our writefn. + */ + p_setw(perm, PCI_EXP_DEVCTL, +- PCI_EXP_DEVCTL_BCR_FLR, ~PCI_EXP_DEVCTL_PHANTOM); ++ PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD, ++ ~PCI_EXP_DEVCTL_PHANTOM); + p_setw(perm, PCI_EXP_DEVCTL2, NO_VIRT, ~PCI_EXP_DEVCTL2_ARI); + return 0; + } +diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c +index 9bd17682655a..1c2289ddd555 100644 +--- a/drivers/video/backlight/pwm_bl.c ++++ b/drivers/video/backlight/pwm_bl.c +@@ -79,14 +79,17 @@ static void pwm_backlight_power_off(struct pwm_bl_data *pb) + static int compute_duty_cycle(struct pwm_bl_data *pb, int brightness) + { + unsigned int lth = pb->lth_brightness; +- int duty_cycle; ++ u64 duty_cycle; + + if (pb->levels) + duty_cycle = pb->levels[brightness]; + else + duty_cycle = brightness; + +- return (duty_cycle * (pb->period - lth) / pb->scale) + lth; ++ duty_cycle *= pb->period - lth; ++ do_div(duty_cycle, pb->scale); ++ ++ return duty_cycle + lth; + } + + static int pwm_backlight_update_status(struct backlight_device *bl) +diff --git a/fs/dcache.c b/fs/dcache.c +index f90141387f01..34c852af215c 100644 +--- a/fs/dcache.c ++++ b/fs/dcache.c +@@ -231,7 +231,7 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c + { + /* + * Be careful about RCU walk racing with rename: +- * use 'lockless_dereference' to fetch the name pointer. ++ * use 'READ_ONCE' to fetch the name pointer. + * + * NOTE! Even if a rename will mean that the length + * was not loaded atomically, we don't care. 
The +@@ -245,7 +245,7 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c + * early because the data cannot match (there can + * be no NUL in the ct/tcount data) + */ +- const unsigned char *cs = lockless_dereference(dentry->d_name.name); ++ const unsigned char *cs = READ_ONCE(dentry->d_name.name); + + return dentry_string_cmp(cs, ct, tcount); + } +diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h +index 25d9b5adcd42..36b49bd09264 100644 +--- a/fs/overlayfs/ovl_entry.h ++++ b/fs/overlayfs/ovl_entry.h +@@ -77,5 +77,5 @@ static inline struct ovl_inode *OVL_I(struct inode *inode) + + static inline struct dentry *ovl_upperdentry_dereference(struct ovl_inode *oi) + { +- return lockless_dereference(oi->__upperdentry); ++ return READ_ONCE(oi->__upperdentry); + } +diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c +index b2c7f33e08fc..d94a51dc4e32 100644 +--- a/fs/overlayfs/readdir.c ++++ b/fs/overlayfs/readdir.c +@@ -757,7 +757,7 @@ static int ovl_dir_fsync(struct file *file, loff_t start, loff_t end, + if (!od->is_upper && OVL_TYPE_UPPER(ovl_path_type(dentry))) { + struct inode *inode = file_inode(file); + +- realfile = lockless_dereference(od->upperfile); ++ realfile = READ_ONCE(od->upperfile); + if (!realfile) { + struct path upperpath; + +diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h +index e549bff87c5b..353f52fdc35e 100644 +--- a/include/asm-generic/vmlinux.lds.h ++++ b/include/asm-generic/vmlinux.lds.h +@@ -688,7 +688,7 @@ + #define BUG_TABLE + #endif + +-#ifdef CONFIG_ORC_UNWINDER ++#ifdef CONFIG_UNWINDER_ORC + #define ORC_UNWIND_TABLE \ + . = ALIGN(4); \ + .orc_unwind_ip : AT(ADDR(.orc_unwind_ip) - LOAD_OFFSET) { \ +diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h +index b8d200f60a40..73bec75b74c8 100644 +--- a/include/linux/bpf_verifier.h ++++ b/include/linux/bpf_verifier.h +@@ -15,11 +15,11 @@ + * In practice this is far bigger than any realistic pointer offset; this limit + * ensures that umax_value + (int)off + (int)size cannot overflow a u64. + */ +-#define BPF_MAX_VAR_OFF (1ULL << 31) ++#define BPF_MAX_VAR_OFF (1 << 29) + /* Maximum variable size permitted for ARG_CONST_SIZE[_OR_ZERO]. This ensures + * that converting umax_value to int cannot overflow. + */ +-#define BPF_MAX_VAR_SIZ INT_MAX ++#define BPF_MAX_VAR_SIZ (1 << 29) + + /* Liveness marks, used for registers and spilled-regs (in stack slots). + * Read marks propagate upwards until they find a write mark; they record that +@@ -110,7 +110,7 @@ struct bpf_insn_aux_data { + struct bpf_map *map_ptr; /* pointer for call insn into lookup_elem */ + }; + int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ +- int converted_op_size; /* the valid value width after perceived conversion */ ++ bool seen; /* this insn was processed by the verifier */ + }; + + #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */ +diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h +index 780b1242bf24..3b609edffa8f 100644 +--- a/include/linux/compiler-clang.h ++++ b/include/linux/compiler-clang.h +@@ -1,5 +1,5 @@ + /* SPDX-License-Identifier: GPL-2.0 */ +-#ifndef __LINUX_COMPILER_H ++#ifndef __LINUX_COMPILER_TYPES_H + #error "Please don't include directly, include instead." 
+ #endif + +diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h +index bb78e5bdff26..2272ded07496 100644 +--- a/include/linux/compiler-gcc.h ++++ b/include/linux/compiler-gcc.h +@@ -1,5 +1,5 @@ + /* SPDX-License-Identifier: GPL-2.0 */ +-#ifndef __LINUX_COMPILER_H ++#ifndef __LINUX_COMPILER_TYPES_H + #error "Please don't include directly, include instead." + #endif + +diff --git a/include/linux/compiler-intel.h b/include/linux/compiler-intel.h +index 523d1b74550f..bfa08160db3a 100644 +--- a/include/linux/compiler-intel.h ++++ b/include/linux/compiler-intel.h +@@ -1,5 +1,5 @@ + /* SPDX-License-Identifier: GPL-2.0 */ +-#ifndef __LINUX_COMPILER_H ++#ifndef __LINUX_COMPILER_TYPES_H + #error "Please don't include directly, include instead." + #endif + +diff --git a/include/linux/compiler.h b/include/linux/compiler.h +index 202710420d6d..fab5dc250c61 100644 +--- a/include/linux/compiler.h ++++ b/include/linux/compiler.h +@@ -2,111 +2,12 @@ + #ifndef __LINUX_COMPILER_H + #define __LINUX_COMPILER_H + +-#ifndef __ASSEMBLY__ ++#include + +-#ifdef __CHECKER__ +-# define __user __attribute__((noderef, address_space(1))) +-# define __kernel __attribute__((address_space(0))) +-# define __safe __attribute__((safe)) +-# define __force __attribute__((force)) +-# define __nocast __attribute__((nocast)) +-# define __iomem __attribute__((noderef, address_space(2))) +-# define __must_hold(x) __attribute__((context(x,1,1))) +-# define __acquires(x) __attribute__((context(x,0,1))) +-# define __releases(x) __attribute__((context(x,1,0))) +-# define __acquire(x) __context__(x,1) +-# define __release(x) __context__(x,-1) +-# define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0) +-# define __percpu __attribute__((noderef, address_space(3))) +-# define __rcu __attribute__((noderef, address_space(4))) +-# define __private __attribute__((noderef)) +-extern void __chk_user_ptr(const volatile void __user *); +-extern void __chk_io_ptr(const volatile void __iomem *); +-# define ACCESS_PRIVATE(p, member) (*((typeof((p)->member) __force *) &(p)->member)) +-#else /* __CHECKER__ */ +-# ifdef STRUCTLEAK_PLUGIN +-# define __user __attribute__((user)) +-# else +-# define __user +-# endif +-# define __kernel +-# define __safe +-# define __force +-# define __nocast +-# define __iomem +-# define __chk_user_ptr(x) (void)0 +-# define __chk_io_ptr(x) (void)0 +-# define __builtin_warning(x, y...) (1) +-# define __must_hold(x) +-# define __acquires(x) +-# define __releases(x) +-# define __acquire(x) (void)0 +-# define __release(x) (void)0 +-# define __cond_lock(x,c) (c) +-# define __percpu +-# define __rcu +-# define __private +-# define ACCESS_PRIVATE(p, member) ((p)->member) +-#endif /* __CHECKER__ */ +- +-/* Indirect macros required for expanded argument pasting, eg. __LINE__. */ +-#define ___PASTE(a,b) a##b +-#define __PASTE(a,b) ___PASTE(a,b) ++#ifndef __ASSEMBLY__ + + #ifdef __KERNEL__ + +-#ifdef __GNUC__ +-#include +-#endif +- +-#if defined(CC_USING_HOTPATCH) && !defined(__CHECKER__) +-#define notrace __attribute__((hotpatch(0,0))) +-#else +-#define notrace __attribute__((no_instrument_function)) +-#endif +- +-/* Intel compiler defines __GNUC__. So we will overwrite implementations +- * coming from above header files here +- */ +-#ifdef __INTEL_COMPILER +-# include +-#endif +- +-/* Clang compiler defines __GNUC__. 
So we will overwrite implementations +- * coming from above header files here +- */ +-#ifdef __clang__ +-#include +-#endif +- +-/* +- * Generic compiler-dependent macros required for kernel +- * build go below this comment. Actual compiler/compiler version +- * specific implementations come from the above header files +- */ +- +-struct ftrace_branch_data { +- const char *func; +- const char *file; +- unsigned line; +- union { +- struct { +- unsigned long correct; +- unsigned long incorrect; +- }; +- struct { +- unsigned long miss; +- unsigned long hit; +- }; +- unsigned long miss_hit[2]; +- }; +-}; +- +-struct ftrace_likely_data { +- struct ftrace_branch_data data; +- unsigned long constant; +-}; +- + /* + * Note: DISABLE_BRANCH_PROFILING can be used by special lowlevel code + * to disable branch tracing on a per file basis. +@@ -333,6 +234,7 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s + * with an explicit memory barrier or atomic instruction that provides the + * required ordering. + */ ++#include + + #define __READ_ONCE(x, check) \ + ({ \ +@@ -341,6 +243,7 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s + __read_once_size(&(x), __u.__c, sizeof(x)); \ + else \ + __read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \ ++ smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \ + __u.__val; \ + }) + #define READ_ONCE(x) __READ_ONCE(x, 1) +@@ -363,167 +266,6 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s + + #endif /* __ASSEMBLY__ */ + +-#ifdef __KERNEL__ +-/* +- * Allow us to mark functions as 'deprecated' and have gcc emit a nice +- * warning for each use, in hopes of speeding the functions removal. +- * Usage is: +- * int __deprecated foo(void) +- */ +-#ifndef __deprecated +-# define __deprecated /* unimplemented */ +-#endif +- +-#ifdef MODULE +-#define __deprecated_for_modules __deprecated +-#else +-#define __deprecated_for_modules +-#endif +- +-#ifndef __must_check +-#define __must_check +-#endif +- +-#ifndef CONFIG_ENABLE_MUST_CHECK +-#undef __must_check +-#define __must_check +-#endif +-#ifndef CONFIG_ENABLE_WARN_DEPRECATED +-#undef __deprecated +-#undef __deprecated_for_modules +-#define __deprecated +-#define __deprecated_for_modules +-#endif +- +-#ifndef __malloc +-#define __malloc +-#endif +- +-/* +- * Allow us to avoid 'defined but not used' warnings on functions and data, +- * as well as force them to be emitted to the assembly file. +- * +- * As of gcc 3.4, static functions that are not marked with attribute((used)) +- * may be elided from the assembly file. As of gcc 3.4, static data not so +- * marked will not be elided, but this may change in a future gcc version. +- * +- * NOTE: Because distributions shipped with a backported unit-at-a-time +- * compiler in gcc 3.3, we must define __used to be __attribute__((used)) +- * for gcc >=3.3 instead of 3.4. +- * +- * In prior versions of gcc, such functions and data would be emitted, but +- * would be warned about except with attribute((unused)). +- * +- * Mark functions that are referenced only in inline assembly as __used so +- * the code is emitted even though it appears to be unreferenced. 
+- */ +-#ifndef __used +-# define __used /* unimplemented */ +-#endif +- +-#ifndef __maybe_unused +-# define __maybe_unused /* unimplemented */ +-#endif +- +-#ifndef __always_unused +-# define __always_unused /* unimplemented */ +-#endif +- +-#ifndef noinline +-#define noinline +-#endif +- +-/* +- * Rather then using noinline to prevent stack consumption, use +- * noinline_for_stack instead. For documentation reasons. +- */ +-#define noinline_for_stack noinline +- +-#ifndef __always_inline +-#define __always_inline inline +-#endif +- +-#endif /* __KERNEL__ */ +- +-/* +- * From the GCC manual: +- * +- * Many functions do not examine any values except their arguments, +- * and have no effects except the return value. Basically this is +- * just slightly more strict class than the `pure' attribute above, +- * since function is not allowed to read global memory. +- * +- * Note that a function that has pointer arguments and examines the +- * data pointed to must _not_ be declared `const'. Likewise, a +- * function that calls a non-`const' function usually must not be +- * `const'. It does not make sense for a `const' function to return +- * `void'. +- */ +-#ifndef __attribute_const__ +-# define __attribute_const__ /* unimplemented */ +-#endif +- +-#ifndef __designated_init +-# define __designated_init +-#endif +- +-#ifndef __latent_entropy +-# define __latent_entropy +-#endif +- +-#ifndef __randomize_layout +-# define __randomize_layout __designated_init +-#endif +- +-#ifndef __no_randomize_layout +-# define __no_randomize_layout +-#endif +- +-#ifndef randomized_struct_fields_start +-# define randomized_struct_fields_start +-# define randomized_struct_fields_end +-#endif +- +-/* +- * Tell gcc if a function is cold. The compiler will assume any path +- * directly leading to the call is unlikely. +- */ +- +-#ifndef __cold +-#define __cold +-#endif +- +-/* Simple shorthand for a section definition */ +-#ifndef __section +-# define __section(S) __attribute__ ((__section__(#S))) +-#endif +- +-#ifndef __visible +-#define __visible +-#endif +- +-#ifndef __nostackprotector +-# define __nostackprotector +-#endif +- +-/* +- * Assume alignment of return value. +- */ +-#ifndef __assume_aligned +-#define __assume_aligned(a, ...) +-#endif +- +- +-/* Are two types/vars the same type (ignoring qualifiers)? 
*/ +-#ifndef __same_type +-# define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b)) +-#endif +- +-/* Is this type a native word size -- useful for atomic operations */ +-#ifndef __native_word +-# define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long)) +-#endif +- + /* Compile time object size, -1 for unknown */ + #ifndef __compiletime_object_size + # define __compiletime_object_size(obj) -1 +diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h +new file mode 100644 +index 000000000000..6b79a9bba9a7 +--- /dev/null ++++ b/include/linux/compiler_types.h +@@ -0,0 +1,274 @@ ++#ifndef __LINUX_COMPILER_TYPES_H ++#define __LINUX_COMPILER_TYPES_H ++ ++#ifndef __ASSEMBLY__ ++ ++#ifdef __CHECKER__ ++# define __user __attribute__((noderef, address_space(1))) ++# define __kernel __attribute__((address_space(0))) ++# define __safe __attribute__((safe)) ++# define __force __attribute__((force)) ++# define __nocast __attribute__((nocast)) ++# define __iomem __attribute__((noderef, address_space(2))) ++# define __must_hold(x) __attribute__((context(x,1,1))) ++# define __acquires(x) __attribute__((context(x,0,1))) ++# define __releases(x) __attribute__((context(x,1,0))) ++# define __acquire(x) __context__(x,1) ++# define __release(x) __context__(x,-1) ++# define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0) ++# define __percpu __attribute__((noderef, address_space(3))) ++# define __rcu __attribute__((noderef, address_space(4))) ++# define __private __attribute__((noderef)) ++extern void __chk_user_ptr(const volatile void __user *); ++extern void __chk_io_ptr(const volatile void __iomem *); ++# define ACCESS_PRIVATE(p, member) (*((typeof((p)->member) __force *) &(p)->member)) ++#else /* __CHECKER__ */ ++# ifdef STRUCTLEAK_PLUGIN ++# define __user __attribute__((user)) ++# else ++# define __user ++# endif ++# define __kernel ++# define __safe ++# define __force ++# define __nocast ++# define __iomem ++# define __chk_user_ptr(x) (void)0 ++# define __chk_io_ptr(x) (void)0 ++# define __builtin_warning(x, y...) (1) ++# define __must_hold(x) ++# define __acquires(x) ++# define __releases(x) ++# define __acquire(x) (void)0 ++# define __release(x) (void)0 ++# define __cond_lock(x,c) (c) ++# define __percpu ++# define __rcu ++# define __private ++# define ACCESS_PRIVATE(p, member) ((p)->member) ++#endif /* __CHECKER__ */ ++ ++/* Indirect macros required for expanded argument pasting, eg. __LINE__. */ ++#define ___PASTE(a,b) a##b ++#define __PASTE(a,b) ___PASTE(a,b) ++ ++#ifdef __KERNEL__ ++ ++#ifdef __GNUC__ ++#include ++#endif ++ ++#if defined(CC_USING_HOTPATCH) && !defined(__CHECKER__) ++#define notrace __attribute__((hotpatch(0,0))) ++#else ++#define notrace __attribute__((no_instrument_function)) ++#endif ++ ++/* Intel compiler defines __GNUC__. So we will overwrite implementations ++ * coming from above header files here ++ */ ++#ifdef __INTEL_COMPILER ++# include ++#endif ++ ++/* Clang compiler defines __GNUC__. So we will overwrite implementations ++ * coming from above header files here ++ */ ++#ifdef __clang__ ++#include ++#endif ++ ++/* ++ * Generic compiler-dependent macros required for kernel ++ * build go below this comment. 
Actual compiler/compiler version ++ * specific implementations come from the above header files ++ */ ++ ++struct ftrace_branch_data { ++ const char *func; ++ const char *file; ++ unsigned line; ++ union { ++ struct { ++ unsigned long correct; ++ unsigned long incorrect; ++ }; ++ struct { ++ unsigned long miss; ++ unsigned long hit; ++ }; ++ unsigned long miss_hit[2]; ++ }; ++}; ++ ++struct ftrace_likely_data { ++ struct ftrace_branch_data data; ++ unsigned long constant; ++}; ++ ++#endif /* __KERNEL__ */ ++ ++#endif /* __ASSEMBLY__ */ ++ ++#ifdef __KERNEL__ ++/* ++ * Allow us to mark functions as 'deprecated' and have gcc emit a nice ++ * warning for each use, in hopes of speeding the functions removal. ++ * Usage is: ++ * int __deprecated foo(void) ++ */ ++#ifndef __deprecated ++# define __deprecated /* unimplemented */ ++#endif ++ ++#ifdef MODULE ++#define __deprecated_for_modules __deprecated ++#else ++#define __deprecated_for_modules ++#endif ++ ++#ifndef __must_check ++#define __must_check ++#endif ++ ++#ifndef CONFIG_ENABLE_MUST_CHECK ++#undef __must_check ++#define __must_check ++#endif ++#ifndef CONFIG_ENABLE_WARN_DEPRECATED ++#undef __deprecated ++#undef __deprecated_for_modules ++#define __deprecated ++#define __deprecated_for_modules ++#endif ++ ++#ifndef __malloc ++#define __malloc ++#endif ++ ++/* ++ * Allow us to avoid 'defined but not used' warnings on functions and data, ++ * as well as force them to be emitted to the assembly file. ++ * ++ * As of gcc 3.4, static functions that are not marked with attribute((used)) ++ * may be elided from the assembly file. As of gcc 3.4, static data not so ++ * marked will not be elided, but this may change in a future gcc version. ++ * ++ * NOTE: Because distributions shipped with a backported unit-at-a-time ++ * compiler in gcc 3.3, we must define __used to be __attribute__((used)) ++ * for gcc >=3.3 instead of 3.4. ++ * ++ * In prior versions of gcc, such functions and data would be emitted, but ++ * would be warned about except with attribute((unused)). ++ * ++ * Mark functions that are referenced only in inline assembly as __used so ++ * the code is emitted even though it appears to be unreferenced. ++ */ ++#ifndef __used ++# define __used /* unimplemented */ ++#endif ++ ++#ifndef __maybe_unused ++# define __maybe_unused /* unimplemented */ ++#endif ++ ++#ifndef __always_unused ++# define __always_unused /* unimplemented */ ++#endif ++ ++#ifndef noinline ++#define noinline ++#endif ++ ++/* ++ * Rather then using noinline to prevent stack consumption, use ++ * noinline_for_stack instead. For documentation reasons. ++ */ ++#define noinline_for_stack noinline ++ ++#ifndef __always_inline ++#define __always_inline inline ++#endif ++ ++#endif /* __KERNEL__ */ ++ ++/* ++ * From the GCC manual: ++ * ++ * Many functions do not examine any values except their arguments, ++ * and have no effects except the return value. Basically this is ++ * just slightly more strict class than the `pure' attribute above, ++ * since function is not allowed to read global memory. ++ * ++ * Note that a function that has pointer arguments and examines the ++ * data pointed to must _not_ be declared `const'. Likewise, a ++ * function that calls a non-`const' function usually must not be ++ * `const'. It does not make sense for a `const' function to return ++ * `void'. 
++ */ ++#ifndef __attribute_const__ ++# define __attribute_const__ /* unimplemented */ ++#endif ++ ++#ifndef __designated_init ++# define __designated_init ++#endif ++ ++#ifndef __latent_entropy ++# define __latent_entropy ++#endif ++ ++#ifndef __randomize_layout ++# define __randomize_layout __designated_init ++#endif ++ ++#ifndef __no_randomize_layout ++# define __no_randomize_layout ++#endif ++ ++#ifndef randomized_struct_fields_start ++# define randomized_struct_fields_start ++# define randomized_struct_fields_end ++#endif ++ ++/* ++ * Tell gcc if a function is cold. The compiler will assume any path ++ * directly leading to the call is unlikely. ++ */ ++ ++#ifndef __cold ++#define __cold ++#endif ++ ++/* Simple shorthand for a section definition */ ++#ifndef __section ++# define __section(S) __attribute__ ((__section__(#S))) ++#endif ++ ++#ifndef __visible ++#define __visible ++#endif ++ ++#ifndef __nostackprotector ++# define __nostackprotector ++#endif ++ ++/* ++ * Assume alignment of return value. ++ */ ++#ifndef __assume_aligned ++#define __assume_aligned(a, ...) ++#endif ++ ++ ++/* Are two types/vars the same type (ignoring qualifiers)? */ ++#ifndef __same_type ++# define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b)) ++#endif ++ ++/* Is this type a native word size -- useful for atomic operations */ ++#ifndef __native_word ++# define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long)) ++#endif ++ ++#endif /* __LINUX_COMPILER_TYPES_H */ +diff --git a/include/linux/hypervisor.h b/include/linux/hypervisor.h +index b4054fd5b6f6..b19563f9a8eb 100644 +--- a/include/linux/hypervisor.h ++++ b/include/linux/hypervisor.h +@@ -7,8 +7,12 @@ + * Juergen Gross + */ + +-#ifdef CONFIG_HYPERVISOR_GUEST +-#include ++#ifdef CONFIG_X86 ++#include ++static inline void hypervisor_pin_vcpu(int cpu) ++{ ++ x86_platform.hyper.pin_vcpu(cpu); ++} + #else + static inline void hypervisor_pin_vcpu(int cpu) + { +diff --git a/include/linux/iio/common/st_sensors.h b/include/linux/iio/common/st_sensors.h +index 7b0fa8b5c120..ce0ef1c0a30a 100644 +--- a/include/linux/iio/common/st_sensors.h ++++ b/include/linux/iio/common/st_sensors.h +@@ -139,7 +139,7 @@ struct st_sensor_das { + * @mask_ihl: mask to enable/disable active low on the INT lines. + * @addr_od: address to enable/disable Open Drain on the INT lines. + * @mask_od: mask to enable/disable Open Drain on the INT lines. +- * @addr_stat_drdy: address to read status of DRDY (data ready) interrupt ++ * struct stat_drdy - status register of DRDY (data ready) interrupt. + * struct ig1 - represents the Interrupt Generator 1 of sensors. + * @en_addr: address of the enable ig1 register. + * @en_mask: mask to write the on/off value for enable. +@@ -152,7 +152,10 @@ struct st_sensor_data_ready_irq { + u8 mask_ihl; + u8 addr_od; + u8 mask_od; +- u8 addr_stat_drdy; ++ struct { ++ u8 addr; ++ u8 mask; ++ } stat_drdy; + struct { + u8 en_addr; + u8 en_mask; +diff --git a/include/linux/intel-pti.h b/include/linux/intel-pti.h +new file mode 100644 +index 000000000000..2710d72de3c9 +--- /dev/null ++++ b/include/linux/intel-pti.h +@@ -0,0 +1,43 @@ ++/* ++ * Copyright (C) Intel 2011 ++ * ++ * This program is free software; you can redistribute it and/or modify ++ * it under the terms of the GNU General Public License version 2 as ++ * published by the Free Software Foundation. 
++ * ++ * This program is distributed in the hope that it will be useful, ++ * but WITHOUT ANY WARRANTY; without even the implied warranty of ++ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ++ * GNU General Public License for more details. ++ * ++ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ++ * ++ * The PTI (Parallel Trace Interface) driver directs trace data routed from ++ * various parts in the system out through the Intel Penwell PTI port and ++ * out of the mobile device for analysis with a debugging tool ++ * (Lauterbach, Fido). This is part of a solution for the MIPI P1149.7, ++ * compact JTAG, standard. ++ * ++ * This header file will allow other parts of the OS to use the ++ * interface to write out it's contents for debugging a mobile system. ++ */ ++ ++#ifndef LINUX_INTEL_PTI_H_ ++#define LINUX_INTEL_PTI_H_ ++ ++/* offset for last dword of any PTI message. Part of MIPI P1149.7 */ ++#define PTI_LASTDWORD_DTS 0x30 ++ ++/* basic structure used as a write address to the PTI HW */ ++struct pti_masterchannel { ++ u8 master; ++ u8 channel; ++}; ++ ++/* the following functions are defined in misc/pti.c */ ++void pti_writedata(struct pti_masterchannel *mc, u8 *buf, int count); ++struct pti_masterchannel *pti_request_masterchannel(u8 type, ++ const char *thread_name); ++void pti_release_masterchannel(struct pti_masterchannel *mc); ++ ++#endif /* LINUX_INTEL_PTI_H_ */ +diff --git a/include/linux/linkage.h b/include/linux/linkage.h +index 2e6f90bd52aa..f68db9e450eb 100644 +--- a/include/linux/linkage.h ++++ b/include/linux/linkage.h +@@ -2,7 +2,7 @@ + #ifndef _LINUX_LINKAGE_H + #define _LINUX_LINKAGE_H + +-#include ++#include + #include + #include + #include +diff --git a/include/linux/mm.h b/include/linux/mm.h +index db647d428100..f50deada0f5c 100644 +--- a/include/linux/mm.h ++++ b/include/linux/mm.h +@@ -2510,7 +2510,7 @@ void vmemmap_populate_print_last(void); + void vmemmap_free(unsigned long start, unsigned long end); + #endif + void register_page_bootmem_memmap(unsigned long section_nr, struct page *map, +- unsigned long size); ++ unsigned long nr_pages); + + enum mf_flags { + MF_COUNT_INCREASED = 1 << 0, +diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h +index 18b06983131a..f0938257ee6d 100644 +--- a/include/linux/mmzone.h ++++ b/include/linux/mmzone.h +@@ -1152,13 +1152,17 @@ struct mem_section { + #define SECTION_ROOT_MASK (SECTIONS_PER_ROOT - 1) + + #ifdef CONFIG_SPARSEMEM_EXTREME +-extern struct mem_section *mem_section[NR_SECTION_ROOTS]; ++extern struct mem_section **mem_section; + #else + extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]; + #endif + + static inline struct mem_section *__nr_to_section(unsigned long nr) + { ++#ifdef CONFIG_SPARSEMEM_EXTREME ++ if (!mem_section) ++ return NULL; ++#endif + if (!mem_section[SECTION_NR_TO_ROOT(nr)]) + return NULL; + return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK]; +diff --git a/include/linux/pti.h b/include/linux/pti.h +deleted file mode 100644 +index b3ea01a3197e..000000000000 +--- a/include/linux/pti.h ++++ /dev/null +@@ -1,43 +0,0 @@ +-/* +- * Copyright (C) Intel 2011 +- * +- * This program is free software; you can redistribute it and/or modify +- * it under the terms of the GNU General Public License version 2 as +- * published by the Free Software Foundation. 
+- * +- * This program is distributed in the hope that it will be useful, +- * but WITHOUT ANY WARRANTY; without even the implied warranty of +- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +- * GNU General Public License for more details. +- * +- * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +- * +- * The PTI (Parallel Trace Interface) driver directs trace data routed from +- * various parts in the system out through the Intel Penwell PTI port and +- * out of the mobile device for analysis with a debugging tool +- * (Lauterbach, Fido). This is part of a solution for the MIPI P1149.7, +- * compact JTAG, standard. +- * +- * This header file will allow other parts of the OS to use the +- * interface to write out it's contents for debugging a mobile system. +- */ +- +-#ifndef PTI_H_ +-#define PTI_H_ +- +-/* offset for last dword of any PTI message. Part of MIPI P1149.7 */ +-#define PTI_LASTDWORD_DTS 0x30 +- +-/* basic structure used as a write address to the PTI HW */ +-struct pti_masterchannel { +- u8 master; +- u8 channel; +-}; +- +-/* the following functions are defined in misc/pti.c */ +-void pti_writedata(struct pti_masterchannel *mc, u8 *buf, int count); +-struct pti_masterchannel *pti_request_masterchannel(u8 type, +- const char *thread_name); +-void pti_release_masterchannel(struct pti_masterchannel *mc); +- +-#endif /*PTI_H_*/ +diff --git a/include/linux/rculist.h b/include/linux/rculist.h +index c2cdd45a880a..127f534fec94 100644 +--- a/include/linux/rculist.h ++++ b/include/linux/rculist.h +@@ -275,7 +275,7 @@ static inline void list_splice_tail_init_rcu(struct list_head *list, + * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock(). + */ + #define list_entry_rcu(ptr, type, member) \ +- container_of(lockless_dereference(ptr), type, member) ++ container_of(READ_ONCE(ptr), type, member) + + /* + * Where are list_empty_rcu() and list_first_entry_rcu()? +@@ -368,7 +368,7 @@ static inline void list_splice_tail_init_rcu(struct list_head *list, + * example is when items are added to the list, but never deleted. + */ + #define list_entry_lockless(ptr, type, member) \ +- container_of((typeof(ptr))lockless_dereference(ptr), type, member) ++ container_of((typeof(ptr))READ_ONCE(ptr), type, member) + + /** + * list_for_each_entry_lockless - iterate over rcu list of given type +diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h +index 1a9f70d44af9..a6ddc42f87a5 100644 +--- a/include/linux/rcupdate.h ++++ b/include/linux/rcupdate.h +@@ -346,7 +346,7 @@ static inline void rcu_preempt_sleep_check(void) { } + #define __rcu_dereference_check(p, c, space) \ + ({ \ + /* Dependency order vs. p above. */ \ +- typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \ ++ typeof(*p) *________p1 = (typeof(*p) *__force)READ_ONCE(p); \ + RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_check() usage"); \ + rcu_dereference_sparse(p, space); \ + ((typeof(*p) __force __kernel *)(________p1)); \ +@@ -360,7 +360,7 @@ static inline void rcu_preempt_sleep_check(void) { } + #define rcu_dereference_raw(p) \ + ({ \ + /* Dependency order vs. p above. 
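The rcu_dereference_check()/rcu_dereference_raw() hunks above are part of a patch-wide conversion from lockless_dereference() to READ_ONCE(). A minimal userspace approximation of the load half of that primitive is sketched below; it is not part of the patch, READ_ONCE_DEMO is an invented name, and by this point in the 4.14 series the real kernel READ_ONCE() also supplies the dependency ordering that lockless_dereference() used to add explicitly.

#include <stdio.h>

#define READ_ONCE_DEMO(x) (*(volatile __typeof__(x) *)&(x))

static int shared = 42;

int main(void)
{
        /* a single, non-torn load the compiler may not re-read or split */
        int v = READ_ONCE_DEMO(shared);

        printf("%d\n", v);
        return 0;
}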
*/ \ +- typeof(p) ________p1 = lockless_dereference(p); \ ++ typeof(p) ________p1 = READ_ONCE(p); \ + ((typeof(*p) __force __kernel *)(________p1)); \ + }) + +diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h +index f65b92e0e1f9..ee8220f8dcf5 100644 +--- a/include/uapi/linux/stddef.h ++++ b/include/uapi/linux/stddef.h +@@ -1,5 +1,5 @@ + /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +-#include ++#include + + #ifndef __always_inline + #define __always_inline inline +diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c +index c48ca2a34b5e..c5ff809e86d0 100644 +--- a/kernel/bpf/verifier.c ++++ b/kernel/bpf/verifier.c +@@ -1061,6 +1061,11 @@ static int check_ptr_alignment(struct bpf_verifier_env *env, + break; + case PTR_TO_STACK: + pointer_desc = "stack "; ++ /* The stack spill tracking logic in check_stack_write() ++ * and check_stack_read() relies on stack accesses being ++ * aligned. ++ */ ++ strict = true; + break; + default: + break; +@@ -1068,6 +1073,29 @@ static int check_ptr_alignment(struct bpf_verifier_env *env, + return check_generic_ptr_alignment(reg, pointer_desc, off, size, strict); + } + ++/* truncate register to smaller size (in bytes) ++ * must be called with size < BPF_REG_SIZE ++ */ ++static void coerce_reg_to_size(struct bpf_reg_state *reg, int size) ++{ ++ u64 mask; ++ ++ /* clear high bits in bit representation */ ++ reg->var_off = tnum_cast(reg->var_off, size); ++ ++ /* fix arithmetic bounds */ ++ mask = ((u64)1 << (size * 8)) - 1; ++ if ((reg->umin_value & ~mask) == (reg->umax_value & ~mask)) { ++ reg->umin_value &= mask; ++ reg->umax_value &= mask; ++ } else { ++ reg->umin_value = 0; ++ reg->umax_value = mask; ++ } ++ reg->smin_value = reg->umin_value; ++ reg->smax_value = reg->umax_value; ++} ++ + /* check whether memory at (regno + off) is accessible for t = (read | write) + * if t==write, value_regno is a register which value is stored into memory + * if t==read, value_regno is a register which will receive the value from memory +@@ -1200,9 +1228,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn + if (!err && size < BPF_REG_SIZE && value_regno >= 0 && t == BPF_READ && + state->regs[value_regno].type == SCALAR_VALUE) { + /* b/h/w load zero-extends, mark upper bits as known 0 */ +- state->regs[value_regno].var_off = tnum_cast( +- state->regs[value_regno].var_off, size); +- __update_reg_bounds(&state->regs[value_regno]); ++ coerce_reg_to_size(&state->regs[value_regno], size); + } + return err; + } +@@ -1282,6 +1308,7 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno, + tnum_strn(tn_buf, sizeof(tn_buf), regs[regno].var_off); + verbose("invalid variable stack read R%d var_off=%s\n", + regno, tn_buf); ++ return -EACCES; + } + off = regs[regno].off + regs[regno].var_off.value; + if (off >= 0 || off < -MAX_BPF_STACK || off + access_size > 0 || +@@ -1742,14 +1769,6 @@ static int check_call(struct bpf_verifier_env *env, int func_id, int insn_idx) + return 0; + } + +-static void coerce_reg_to_32(struct bpf_reg_state *reg) +-{ +- /* clear high 32 bits */ +- reg->var_off = tnum_cast(reg->var_off, 4); +- /* Update bounds */ +- __update_reg_bounds(reg); +-} +- + static bool signed_add_overflows(s64 a, s64 b) + { + /* Do the add in u64, where overflow is well-defined */ +@@ -1770,6 +1789,41 @@ static bool signed_sub_overflows(s64 a, s64 b) + return res > a; + } + ++static bool check_reg_sane_offset(struct bpf_verifier_env *env, ++ const struct bpf_reg_state *reg, ++ enum 
bpf_reg_type type) ++{ ++ bool known = tnum_is_const(reg->var_off); ++ s64 val = reg->var_off.value; ++ s64 smin = reg->smin_value; ++ ++ if (known && (val >= BPF_MAX_VAR_OFF || val <= -BPF_MAX_VAR_OFF)) { ++ verbose("math between %s pointer and %lld is not allowed\n", ++ reg_type_str[type], val); ++ return false; ++ } ++ ++ if (reg->off >= BPF_MAX_VAR_OFF || reg->off <= -BPF_MAX_VAR_OFF) { ++ verbose("%s pointer offset %d is not allowed\n", ++ reg_type_str[type], reg->off); ++ return false; ++ } ++ ++ if (smin == S64_MIN) { ++ verbose("math between %s pointer and register with unbounded min value is not allowed\n", ++ reg_type_str[type]); ++ return false; ++ } ++ ++ if (smin >= BPF_MAX_VAR_OFF || smin <= -BPF_MAX_VAR_OFF) { ++ verbose("value %lld makes %s pointer be out of bounds\n", ++ smin, reg_type_str[type]); ++ return false; ++ } ++ ++ return true; ++} ++ + /* Handles arithmetic on a pointer and a scalar: computes new min/max and var_off. + * Caller should also handle BPF_MOV case separately. + * If we return -EACCES, caller may want to try again treating pointer as a +@@ -1835,6 +1889,10 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, + dst_reg->type = ptr_reg->type; + dst_reg->id = ptr_reg->id; + ++ if (!check_reg_sane_offset(env, off_reg, ptr_reg->type) || ++ !check_reg_sane_offset(env, ptr_reg, ptr_reg->type)) ++ return -EINVAL; ++ + switch (opcode) { + case BPF_ADD: + /* We can take a fixed offset as long as it doesn't overflow +@@ -1965,12 +2023,19 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, + return -EACCES; + } + ++ if (!check_reg_sane_offset(env, dst_reg, ptr_reg->type)) ++ return -EINVAL; ++ + __update_reg_bounds(dst_reg); + __reg_deduce_bounds(dst_reg); + __reg_bound_offset(dst_reg); + return 0; + } + ++/* WARNING: This function does calculations on 64-bit values, but the actual ++ * execution may occur on 32-bit values. Therefore, things like bitshifts ++ * need extra checks in the 32-bit case. ++ */ + static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, + struct bpf_insn *insn, + struct bpf_reg_state *dst_reg, +@@ -1981,12 +2046,8 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, + bool src_known, dst_known; + s64 smin_val, smax_val; + u64 umin_val, umax_val; ++ u64 insn_bitness = (BPF_CLASS(insn->code) == BPF_ALU64) ? 64 : 32; + +- if (BPF_CLASS(insn->code) != BPF_ALU64) { +- /* 32-bit ALU ops are (32,32)->64 */ +- coerce_reg_to_32(dst_reg); +- coerce_reg_to_32(&src_reg); +- } + smin_val = src_reg.smin_value; + smax_val = src_reg.smax_value; + umin_val = src_reg.umin_value; +@@ -1994,6 +2055,12 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, + src_known = tnum_is_const(src_reg.var_off); + dst_known = tnum_is_const(dst_reg->var_off); + ++ if (!src_known && ++ opcode != BPF_ADD && opcode != BPF_SUB && opcode != BPF_AND) { ++ __mark_reg_unknown(dst_reg); ++ return 0; ++ } ++ + switch (opcode) { + case BPF_ADD: + if (signed_add_overflows(dst_reg->smin_value, smin_val) || +@@ -2122,9 +2189,9 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, + __update_reg_bounds(dst_reg); + break; + case BPF_LSH: +- if (umax_val > 63) { +- /* Shifts greater than 63 are undefined. This includes +- * shifts by a negative number. ++ if (umax_val >= insn_bitness) { ++ /* Shifts greater than 31 or 63 are undefined. ++ * This includes shifts by a negative number. 
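The BPF_LSH/BPF_RSH changes above replace the hard-coded check against 63 with one against insn_bitness, so 32-bit shifts are validated against 32 rather than 64. The standalone sketch below (not part of the patch; shift_count_ok is an invented name) mirrors that check: any shift count that can reach the register width is undefined, so the verifier must treat the result as unknown.

#include <stdint.h>
#include <stdio.h>

/* Same condition the patched verifier applies before BPF_LSH/BPF_RSH. */
static int shift_count_ok(uint64_t umax_val, unsigned int insn_bitness)
{
        return umax_val < insn_bitness;
}

int main(void)
{
        printf("32-bit shift by 31: %s\n", shift_count_ok(31, 32) ? "defined" : "unknown");
        printf("32-bit shift by 32: %s\n", shift_count_ok(32, 32) ? "defined" : "unknown");
        printf("64-bit shift by 63: %s\n", shift_count_ok(63, 64) ? "defined" : "unknown");
        return 0;
}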
+ */ + mark_reg_unknown(regs, insn->dst_reg); + break; +@@ -2150,27 +2217,29 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, + __update_reg_bounds(dst_reg); + break; + case BPF_RSH: +- if (umax_val > 63) { +- /* Shifts greater than 63 are undefined. This includes +- * shifts by a negative number. ++ if (umax_val >= insn_bitness) { ++ /* Shifts greater than 31 or 63 are undefined. ++ * This includes shifts by a negative number. + */ + mark_reg_unknown(regs, insn->dst_reg); + break; + } +- /* BPF_RSH is an unsigned shift, so make the appropriate casts */ +- if (dst_reg->smin_value < 0) { +- if (umin_val) { +- /* Sign bit will be cleared */ +- dst_reg->smin_value = 0; +- } else { +- /* Lost sign bit information */ +- dst_reg->smin_value = S64_MIN; +- dst_reg->smax_value = S64_MAX; +- } +- } else { +- dst_reg->smin_value = +- (u64)(dst_reg->smin_value) >> umax_val; +- } ++ /* BPF_RSH is an unsigned shift. If the value in dst_reg might ++ * be negative, then either: ++ * 1) src_reg might be zero, so the sign bit of the result is ++ * unknown, so we lose our signed bounds ++ * 2) it's known negative, thus the unsigned bounds capture the ++ * signed bounds ++ * 3) the signed bounds cross zero, so they tell us nothing ++ * about the result ++ * If the value in dst_reg is known nonnegative, then again the ++ * unsigned bounts capture the signed bounds. ++ * Thus, in all cases it suffices to blow away our signed bounds ++ * and rely on inferring new ones from the unsigned bounds and ++ * var_off of the result. ++ */ ++ dst_reg->smin_value = S64_MIN; ++ dst_reg->smax_value = S64_MAX; + if (src_known) + dst_reg->var_off = tnum_rshift(dst_reg->var_off, + umin_val); +@@ -2186,6 +2255,12 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, + break; + } + ++ if (BPF_CLASS(insn->code) != BPF_ALU64) { ++ /* 32-bit ALU ops are (32,32)->32 */ ++ coerce_reg_to_size(dst_reg, 4); ++ coerce_reg_to_size(&src_reg, 4); ++ } ++ + __reg_deduce_bounds(dst_reg); + __reg_bound_offset(dst_reg); + return 0; +@@ -2362,17 +2437,20 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) + return -EACCES; + } + mark_reg_unknown(regs, insn->dst_reg); +- /* high 32 bits are known zero. */ +- regs[insn->dst_reg].var_off = tnum_cast( +- regs[insn->dst_reg].var_off, 4); +- __update_reg_bounds(®s[insn->dst_reg]); ++ coerce_reg_to_size(®s[insn->dst_reg], 4); + } + } else { + /* case: R = imm + * remember the value we stored into this reg + */ + regs[insn->dst_reg].type = SCALAR_VALUE; +- __mark_reg_known(regs + insn->dst_reg, insn->imm); ++ if (BPF_CLASS(insn->code) == BPF_ALU64) { ++ __mark_reg_known(regs + insn->dst_reg, ++ insn->imm); ++ } else { ++ __mark_reg_known(regs + insn->dst_reg, ++ (u32)insn->imm); ++ } + } + + } else if (opcode > BPF_END) { +@@ -3307,15 +3385,14 @@ static bool regsafe(struct bpf_reg_state *rold, struct bpf_reg_state *rcur, + return range_within(rold, rcur) && + tnum_in(rold->var_off, rcur->var_off); + } else { +- /* if we knew anything about the old value, we're not +- * equal, because we can't know anything about the +- * scalar value of the pointer in the new value. ++ /* We're trying to use a pointer in place of a scalar. ++ * Even if the scalar was unbounded, this could lead to ++ * pointer leaks because scalars are allowed to leak ++ * while pointers are not. We could make this safe in ++ * special cases if root is calling us, but it's ++ * probably not worth the hassle. 
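coerce_reg_to_size(), used above for sub-64-bit loads and after 32-bit ALU ops, truncates both the tnum and the tracked bounds. A userspace model of the unsigned-bounds part is sketched below; it is illustrative only (coerce_to_size and struct bounds are invented names) and omits the signed-bounds and var_off handling of the real function.

#include <stdint.h>
#include <stdio.h>

struct bounds {
        uint64_t umin, umax;
};

/* Truncate tracked unsigned bounds to 'size' bytes (size < 8). */
static void coerce_to_size(struct bounds *b, int size)
{
        uint64_t mask = ((uint64_t)1 << (size * 8)) - 1;

        if ((b->umin & ~mask) == (b->umax & ~mask)) {
                /* high bits identical: truncation keeps the range exact */
                b->umin &= mask;
                b->umax &= mask;
        } else {
                /* range straddles a high-bit change: only the mask is known */
                b->umin = 0;
                b->umax = mask;
        }
}

int main(void)
{
        struct bounds b = { 0x100000010ULL, 0x1000000f0ULL };

        coerce_to_size(&b, 4);
        printf("umin=%#llx umax=%#llx\n",
               (unsigned long long)b.umin, (unsigned long long)b.umax);
        return 0;
}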
+ */ +- return rold->umin_value == 0 && +- rold->umax_value == U64_MAX && +- rold->smin_value == S64_MIN && +- rold->smax_value == S64_MAX && +- tnum_is_unknown(rold->var_off); ++ return false; + } + case PTR_TO_MAP_VALUE: + /* If the new min/max/var_off satisfy the old ones and +@@ -3665,6 +3742,7 @@ static int do_check(struct bpf_verifier_env *env) + if (err) + return err; + ++ env->insn_aux_data[insn_idx].seen = true; + if (class == BPF_ALU || class == BPF_ALU64) { + err = check_alu_op(env, insn); + if (err) +@@ -3855,6 +3933,7 @@ static int do_check(struct bpf_verifier_env *env) + return err; + + insn_idx++; ++ env->insn_aux_data[insn_idx].seen = true; + } else { + verbose("invalid BPF_LD mode\n"); + return -EINVAL; +@@ -4035,6 +4114,7 @@ static int adjust_insn_aux_data(struct bpf_verifier_env *env, u32 prog_len, + u32 off, u32 cnt) + { + struct bpf_insn_aux_data *new_data, *old_data = env->insn_aux_data; ++ int i; + + if (cnt == 1) + return 0; +@@ -4044,6 +4124,8 @@ static int adjust_insn_aux_data(struct bpf_verifier_env *env, u32 prog_len, + memcpy(new_data, old_data, sizeof(struct bpf_insn_aux_data) * off); + memcpy(new_data + off + cnt - 1, old_data + off, + sizeof(struct bpf_insn_aux_data) * (prog_len - off - cnt + 1)); ++ for (i = off; i < off + cnt - 1; i++) ++ new_data[i].seen = true; + env->insn_aux_data = new_data; + vfree(old_data); + return 0; +@@ -4062,6 +4144,25 @@ static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of + return new_prog; + } + ++/* The verifier does more data flow analysis than llvm and will not explore ++ * branches that are dead at run time. Malicious programs can have dead code ++ * too. Therefore replace all dead at-run-time code with nops. ++ */ ++static void sanitize_dead_code(struct bpf_verifier_env *env) ++{ ++ struct bpf_insn_aux_data *aux_data = env->insn_aux_data; ++ struct bpf_insn nop = BPF_MOV64_REG(BPF_REG_0, BPF_REG_0); ++ struct bpf_insn *insn = env->prog->insnsi; ++ const int insn_cnt = env->prog->len; ++ int i; ++ ++ for (i = 0; i < insn_cnt; i++) { ++ if (aux_data[i].seen) ++ continue; ++ memcpy(insn + i, &nop, sizeof(nop)); ++ } ++} ++ + /* convert load instructions that access fields of 'struct __sk_buff' + * into sequence of instructions that access fields of 'struct sk_buff' + */ +@@ -4378,6 +4479,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr) + while (pop_stack(env, NULL) >= 0); + free_states(env); + ++ if (ret == 0) ++ sanitize_dead_code(env); ++ + if (ret == 0) + /* program is valid, convert *(u32*)(ctx + off) accesses */ + ret = convert_ctx_accesses(env); +diff --git a/kernel/events/core.c b/kernel/events/core.c +index 4f1d4bfc607a..24ebad5567b4 100644 +--- a/kernel/events/core.c ++++ b/kernel/events/core.c +@@ -4233,7 +4233,7 @@ static void perf_remove_from_owner(struct perf_event *event) + * indeed free this event, otherwise we need to serialize on + * owner->perf_event_mutex. + */ +- owner = lockless_dereference(event->owner); ++ owner = READ_ONCE(event->owner); + if (owner) { + /* + * Since delayed_put_task_struct() also drops the last +@@ -4330,7 +4330,7 @@ int perf_event_release_kernel(struct perf_event *event) + * Cannot change, child events are not migrated, see the + * comment with perf_event_ctx_lock_nested(). + */ +- ctx = lockless_dereference(child->ctx); ++ ctx = READ_ONCE(child->ctx); + /* + * Since child_mutex nests inside ctx::mutex, we must jump + * through hoops. We start by grabbing a reference on the ctx. 
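sanitize_dead_code(), added above, runs after verification and overwrites every instruction slot that was never marked seen with a register-to-itself move. A toy model of that pass, not part of the patch and using invented names, is sketched below:

#include <stdio.h>
#include <string.h>

struct insn { char text[16]; };
struct aux  { int seen; };

static void sanitize(struct insn *prog, const struct aux *aux, int cnt)
{
        /* stand-in for BPF_MOV64_REG(BPF_REG_0, BPF_REG_0) */
        struct insn nop = { "mov r0, r0" };
        int i;

        for (i = 0; i < cnt; i++) {
                if (aux[i].seen)
                        continue;
                memcpy(&prog[i], &nop, sizeof(nop));
        }
}

int main(void)
{
        struct insn prog[3] = { {"ld r1, [r2]"}, {"jmp +1"}, {"add r0, 1"} };
        struct aux  aux[3]  = { {1}, {1}, {0} };  /* third insn never reached */
        int i;

        sanitize(prog, aux, 3);
        for (i = 0; i < 3; i++)
                printf("%d: %s\n", i, prog[i].text);
        return 0;
}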
+diff --git a/kernel/seccomp.c b/kernel/seccomp.c +index 418a1c045933..5f0dfb2abb8d 100644 +--- a/kernel/seccomp.c ++++ b/kernel/seccomp.c +@@ -190,7 +190,7 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd, + u32 ret = SECCOMP_RET_ALLOW; + /* Make sure cross-thread synced filter points somewhere sane. */ + struct seccomp_filter *f = +- lockless_dereference(current->seccomp.filter); ++ READ_ONCE(current->seccomp.filter); + + /* Ensure unexpected behavior doesn't result in failing open. */ + if (unlikely(WARN_ON(f == NULL))) +diff --git a/kernel/task_work.c b/kernel/task_work.c +index 5718b3ea202a..0fef395662a6 100644 +--- a/kernel/task_work.c ++++ b/kernel/task_work.c +@@ -68,7 +68,7 @@ task_work_cancel(struct task_struct *task, task_work_func_t func) + * we raced with task_work_run(), *pprev == NULL/exited. + */ + raw_spin_lock_irqsave(&task->pi_lock, flags); +- while ((work = lockless_dereference(*pprev))) { ++ while ((work = READ_ONCE(*pprev))) { + if (work->func != func) + pprev = &work->next; + else if (cmpxchg(pprev, work, work->next) == work) +diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c +index dc498b605d5d..6350f64d5aa4 100644 +--- a/kernel/trace/bpf_trace.c ++++ b/kernel/trace/bpf_trace.c +@@ -293,14 +293,13 @@ static const struct bpf_func_proto bpf_perf_event_read_proto = { + .arg2_type = ARG_ANYTHING, + }; + +-static DEFINE_PER_CPU(struct perf_sample_data, bpf_sd); ++static DEFINE_PER_CPU(struct perf_sample_data, bpf_trace_sd); + + static __always_inline u64 + __bpf_perf_event_output(struct pt_regs *regs, struct bpf_map *map, +- u64 flags, struct perf_raw_record *raw) ++ u64 flags, struct perf_sample_data *sd) + { + struct bpf_array *array = container_of(map, struct bpf_array, map); +- struct perf_sample_data *sd = this_cpu_ptr(&bpf_sd); + unsigned int cpu = smp_processor_id(); + u64 index = flags & BPF_F_INDEX_MASK; + struct bpf_event_entry *ee; +@@ -323,8 +322,6 @@ __bpf_perf_event_output(struct pt_regs *regs, struct bpf_map *map, + if (unlikely(event->oncpu != cpu)) + return -EOPNOTSUPP; + +- perf_sample_data_init(sd, 0, 0); +- sd->raw = raw; + perf_event_output(event, sd, regs); + return 0; + } +@@ -332,6 +329,7 @@ __bpf_perf_event_output(struct pt_regs *regs, struct bpf_map *map, + BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map, + u64, flags, void *, data, u64, size) + { ++ struct perf_sample_data *sd = this_cpu_ptr(&bpf_trace_sd); + struct perf_raw_record raw = { + .frag = { + .size = size, +@@ -342,7 +340,10 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map, + if (unlikely(flags & ~(BPF_F_INDEX_MASK))) + return -EINVAL; + +- return __bpf_perf_event_output(regs, map, flags, &raw); ++ perf_sample_data_init(sd, 0, 0); ++ sd->raw = &raw; ++ ++ return __bpf_perf_event_output(regs, map, flags, sd); + } + + static const struct bpf_func_proto bpf_perf_event_output_proto = { +@@ -357,10 +358,12 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = { + }; + + static DEFINE_PER_CPU(struct pt_regs, bpf_pt_regs); ++static DEFINE_PER_CPU(struct perf_sample_data, bpf_misc_sd); + + u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size, + void *ctx, u64 ctx_size, bpf_ctx_copy_t ctx_copy) + { ++ struct perf_sample_data *sd = this_cpu_ptr(&bpf_misc_sd); + struct pt_regs *regs = this_cpu_ptr(&bpf_pt_regs); + struct perf_raw_frag frag = { + .copy = ctx_copy, +@@ -378,8 +381,10 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 
meta_size, + }; + + perf_fetch_caller_regs(regs); ++ perf_sample_data_init(sd, 0, 0); ++ sd->raw = &raw; + +- return __bpf_perf_event_output(regs, map, flags, &raw); ++ return __bpf_perf_event_output(regs, map, flags, sd); + } + + BPF_CALL_0(bpf_get_current_task) +diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c +index 1c21d0e2a145..7eb975a2d0e1 100644 +--- a/kernel/trace/trace_events_hist.c ++++ b/kernel/trace/trace_events_hist.c +@@ -450,7 +450,7 @@ static int create_val_field(struct hist_trigger_data *hist_data, + } + + field = trace_find_event_field(file->event_call, field_name); +- if (!field) { ++ if (!field || !field->size) { + ret = -EINVAL; + goto out; + } +@@ -548,7 +548,7 @@ static int create_key_field(struct hist_trigger_data *hist_data, + } + + field = trace_find_event_field(file->event_call, field_name); +- if (!field) { ++ if (!field || !field->size) { + ret = -EINVAL; + goto out; + } +diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug +index dfdad67d8f6c..ff21b4dbb392 100644 +--- a/lib/Kconfig.debug ++++ b/lib/Kconfig.debug +@@ -376,7 +376,7 @@ config STACK_VALIDATION + that runtime stack traces are more reliable. + + This is also a prerequisite for generation of ORC unwind data, which +- is needed for CONFIG_ORC_UNWINDER. ++ is needed for CONFIG_UNWINDER_ORC. + + For more information, see + tools/objtool/Documentation/stack-validation.txt. +diff --git a/mm/slab.h b/mm/slab.h +index 028cdc7df67e..86d7c7d860f9 100644 +--- a/mm/slab.h ++++ b/mm/slab.h +@@ -259,7 +259,7 @@ cache_from_memcg_idx(struct kmem_cache *s, int idx) + * memcg_caches issues a write barrier to match this (see + * memcg_create_kmem_cache()). + */ +- cachep = lockless_dereference(arr->entries[idx]); ++ cachep = READ_ONCE(arr->entries[idx]); + rcu_read_unlock(); + + return cachep; +diff --git a/mm/sparse.c b/mm/sparse.c +index 4900707ae146..60805abf98af 100644 +--- a/mm/sparse.c ++++ b/mm/sparse.c +@@ -23,8 +23,7 @@ + * 1) mem_section - memory sections, mem_map's for valid memory + */ + #ifdef CONFIG_SPARSEMEM_EXTREME +-struct mem_section *mem_section[NR_SECTION_ROOTS] +- ____cacheline_internodealigned_in_smp; ++struct mem_section **mem_section; + #else + struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT] + ____cacheline_internodealigned_in_smp; +@@ -101,7 +100,7 @@ static inline int sparse_index_init(unsigned long section_nr, int nid) + int __section_nr(struct mem_section* ms) + { + unsigned long root_nr; +- struct mem_section* root; ++ struct mem_section *root = NULL; + + for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) { + root = __nr_to_section(root_nr * SECTIONS_PER_ROOT); +@@ -112,7 +111,7 @@ int __section_nr(struct mem_section* ms) + break; + } + +- VM_BUG_ON(root_nr == NR_SECTION_ROOTS); ++ VM_BUG_ON(!root); + + return (root_nr * SECTIONS_PER_ROOT) + (ms - root); + } +@@ -208,6 +207,16 @@ void __init memory_present(int nid, unsigned long start, unsigned long end) + { + unsigned long pfn; + ++#ifdef CONFIG_SPARSEMEM_EXTREME ++ if (unlikely(!mem_section)) { ++ unsigned long size, align; ++ ++ size = sizeof(struct mem_section) * NR_SECTION_ROOTS; ++ align = 1 << (INTERNODE_CACHE_SHIFT); ++ mem_section = memblock_virt_alloc(size, align); ++ } ++#endif ++ + start &= PAGE_SECTION_MASK; + mminit_validate_memmodel_limits(&start, &end); + for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) { +@@ -330,11 +339,17 @@ sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat, + static void __init check_usemap_section_nr(int nid, unsigned long 
*usemap) + { + unsigned long usemap_snr, pgdat_snr; +- static unsigned long old_usemap_snr = NR_MEM_SECTIONS; +- static unsigned long old_pgdat_snr = NR_MEM_SECTIONS; ++ static unsigned long old_usemap_snr; ++ static unsigned long old_pgdat_snr; + struct pglist_data *pgdat = NODE_DATA(nid); + int usemap_nid; + ++ /* First call */ ++ if (!old_usemap_snr) { ++ old_usemap_snr = NR_MEM_SECTIONS; ++ old_pgdat_snr = NR_MEM_SECTIONS; ++ } ++ + usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT); + pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT); + if (usemap_snr == pgdat_snr) +diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c +index 467e44d7587d..045331204097 100644 +--- a/net/ipv4/ip_gre.c ++++ b/net/ipv4/ip_gre.c +@@ -579,8 +579,8 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev, + if (gre_handle_offloads(skb, false)) + goto err_free_rt; + +- if (skb->len > dev->mtu) { +- pskb_trim(skb, dev->mtu); ++ if (skb->len > dev->mtu + dev->hard_header_len) { ++ pskb_trim(skb, dev->mtu + dev->hard_header_len); + truncate = true; + } + +@@ -731,8 +731,8 @@ static netdev_tx_t erspan_xmit(struct sk_buff *skb, + if (skb_cow_head(skb, dev->needed_headroom)) + goto free_skb; + +- if (skb->len - dev->hard_header_len > dev->mtu) { +- pskb_trim(skb, dev->mtu); ++ if (skb->len > dev->mtu + dev->hard_header_len) { ++ pskb_trim(skb, dev->mtu + dev->hard_header_len); + truncate = true; + } + +diff --git a/net/ipv4/tcp_vegas.c b/net/ipv4/tcp_vegas.c +index 218cfcc77650..ee113ff15fd0 100644 +--- a/net/ipv4/tcp_vegas.c ++++ b/net/ipv4/tcp_vegas.c +@@ -158,7 +158,7 @@ EXPORT_SYMBOL_GPL(tcp_vegas_cwnd_event); + + static inline u32 tcp_vegas_ssthresh(struct tcp_sock *tp) + { +- return min(tp->snd_ssthresh, tp->snd_cwnd-1); ++ return min(tp->snd_ssthresh, tp->snd_cwnd); + } + + static void tcp_vegas_cong_avoid(struct sock *sk, u32 ack, u32 acked) +diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c +index 8a1c846d3df9..2ec39404c449 100644 +--- a/net/ipv6/addrconf.c ++++ b/net/ipv6/addrconf.c +@@ -303,10 +303,10 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = { + .disable_policy = 0, + }; + +-/* Check if a valid qdisc is available */ +-static inline bool addrconf_qdisc_ok(const struct net_device *dev) ++/* Check if link is ready: is it up and is a valid qdisc available */ ++static inline bool addrconf_link_ready(const struct net_device *dev) + { +- return !qdisc_tx_is_noop(dev); ++ return netif_oper_up(dev) && !qdisc_tx_is_noop(dev); + } + + static void addrconf_del_rs_timer(struct inet6_dev *idev) +@@ -451,7 +451,7 @@ static struct inet6_dev *ipv6_add_dev(struct net_device *dev) + + ndev->token = in6addr_any; + +- if (netif_running(dev) && addrconf_qdisc_ok(dev)) ++ if (netif_running(dev) && addrconf_link_ready(dev)) + ndev->if_flags |= IF_READY; + + ipv6_mc_init_dev(ndev); +@@ -3404,7 +3404,7 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event, + /* restore routes for permanent addresses */ + addrconf_permanent_addr(dev); + +- if (!addrconf_qdisc_ok(dev)) { ++ if (!addrconf_link_ready(dev)) { + /* device is not ready yet. */ + pr_info("ADDRCONF(NETDEV_UP): %s: link is not ready\n", + dev->name); +@@ -3419,7 +3419,7 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event, + run_pending = 1; + } + } else if (event == NETDEV_CHANGE) { +- if (!addrconf_qdisc_ok(dev)) { ++ if (!addrconf_link_ready(dev)) { + /* device is still not ready. 
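The addrconf change above replaces the qdisc-only addrconf_qdisc_ok() test with addrconf_link_ready(), which additionally requires the operational state to be up. A trivial userspace model of the new predicate, with stubbed stand-ins for the kernel helpers and an invented struct, is sketched below:

#include <stdbool.h>
#include <stdio.h>

struct net_device {
        bool oper_up;
        bool tx_is_noop;
};

static bool netif_oper_up(const struct net_device *dev)
{
        return dev->oper_up;
}

static bool qdisc_tx_is_noop(const struct net_device *dev)
{
        return dev->tx_is_noop;
}

static bool addrconf_link_ready(const struct net_device *dev)
{
        return netif_oper_up(dev) && !qdisc_tx_is_noop(dev);
}

int main(void)
{
        /* a valid qdisc alone is no longer enough; the link must also be up */
        struct net_device dev = { .oper_up = false, .tx_is_noop = false };

        printf("ready: %d\n", addrconf_link_ready(&dev));
        return 0;
}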
*/ + break; + } +diff --git a/net/ipv6/route.c b/net/ipv6/route.c +index 598efa8cfe25..76b47682f77f 100644 +--- a/net/ipv6/route.c ++++ b/net/ipv6/route.c +@@ -1055,7 +1055,6 @@ static struct rt6_info *rt6_get_pcpu_route(struct rt6_info *rt) + + static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt) + { +- struct fib6_table *table = rt->rt6i_table; + struct rt6_info *pcpu_rt, *prev, **p; + + pcpu_rt = ip6_rt_pcpu_alloc(rt); +@@ -1066,28 +1065,20 @@ static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt) + return net->ipv6.ip6_null_entry; + } + +- read_lock_bh(&table->tb6_lock); +- if (rt->rt6i_pcpu) { +- p = this_cpu_ptr(rt->rt6i_pcpu); +- prev = cmpxchg(p, NULL, pcpu_rt); +- if (prev) { +- /* If someone did it before us, return prev instead */ +- dst_release_immediate(&pcpu_rt->dst); +- pcpu_rt = prev; +- } +- } else { +- /* rt has been removed from the fib6 tree +- * before we have a chance to acquire the read_lock. +- * In this case, don't brother to create a pcpu rt +- * since rt is going away anyway. The next +- * dst_check() will trigger a re-lookup. +- */ ++ dst_hold(&pcpu_rt->dst); ++ p = this_cpu_ptr(rt->rt6i_pcpu); ++ prev = cmpxchg(p, NULL, pcpu_rt); ++ if (prev) { ++ /* If someone did it before us, return prev instead */ ++ /* release refcnt taken by ip6_rt_pcpu_alloc() */ + dst_release_immediate(&pcpu_rt->dst); +- pcpu_rt = rt; ++ /* release refcnt taken by above dst_hold() */ ++ dst_release_immediate(&pcpu_rt->dst); ++ dst_hold(&prev->dst); ++ pcpu_rt = prev; + } +- dst_hold(&pcpu_rt->dst); ++ + rt6_dst_from_metrics_check(pcpu_rt); +- read_unlock_bh(&table->tb6_lock); + return pcpu_rt; + } + +@@ -1177,19 +1168,28 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table, + if (pcpu_rt) { + read_unlock_bh(&table->tb6_lock); + } else { +- /* We have to do the read_unlock first +- * because rt6_make_pcpu_route() may trigger +- * ip6_dst_gc() which will take the write_lock. +- */ +- dst_hold(&rt->dst); +- read_unlock_bh(&table->tb6_lock); +- pcpu_rt = rt6_make_pcpu_route(rt); +- dst_release(&rt->dst); ++ /* atomic_inc_not_zero() is needed when using rcu */ ++ if (atomic_inc_not_zero(&rt->rt6i_ref)) { ++ /* We have to do the read_unlock first ++ * because rt6_make_pcpu_route() may trigger ++ * ip6_dst_gc() which will take the write_lock. ++ * ++ * No dst_hold() on rt is needed because grabbing ++ * rt->rt6i_ref makes sure rt can't be released. ++ */ ++ read_unlock_bh(&table->tb6_lock); ++ pcpu_rt = rt6_make_pcpu_route(rt); ++ rt6_release(rt); ++ } else { ++ /* rt is already removed from tree */ ++ read_unlock_bh(&table->tb6_lock); ++ pcpu_rt = net->ipv6.ip6_null_entry; ++ dst_hold(&pcpu_rt->dst); ++ } + } + + trace_fib6_table_lookup(net, pcpu_rt, table->tb6_id, fl6); + return pcpu_rt; +- + } + } + EXPORT_SYMBOL_GPL(ip6_pol_route); +diff --git a/net/sctp/stream.c b/net/sctp/stream.c +index fa8371ff05c4..724adf2786a2 100644 +--- a/net/sctp/stream.c ++++ b/net/sctp/stream.c +@@ -40,9 +40,14 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt, + { + int i; + ++ gfp |= __GFP_NOWARN; ++ + /* Initial stream->out size may be very big, so free it and alloc +- * a new one with new outcnt to save memory. ++ * a new one with new outcnt to save memory if needed. 
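The sctp_stream_init() hunk above skips freeing and reallocating stream->out when the requested outcnt already matches the current one. A simplified standalone model of that early-out is sketched below; it is not part of the patch, the names are invented, and it allocates before freeing purely to keep the demo state consistent on failure (the kernel code keeps its original free-then-kcalloc order).

#include <stdio.h>
#include <stdlib.h>

struct stream {
        int *out;
        unsigned short outcnt;
};

static int stream_resize_out(struct stream *s, unsigned short outcnt)
{
        int *tmp;

        if (outcnt == s->outcnt)        /* nothing to do, keep current array */
                return 0;

        tmp = calloc(outcnt, sizeof(*tmp));
        if (!tmp)
                return -1;

        free(s->out);
        s->out = tmp;
        s->outcnt = outcnt;
        return 0;
}

int main(void)
{
        struct stream s = { NULL, 0 };

        stream_resize_out(&s, 16);
        stream_resize_out(&s, 16);      /* second call skips the realloc */
        printf("outcnt = %u\n", s.outcnt);
        free(s.out);
        return 0;
}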
+ */ ++ if (outcnt == stream->outcnt) ++ goto in; ++ + kfree(stream->out); + + stream->out = kcalloc(outcnt, sizeof(*stream->out), gfp); +@@ -53,6 +58,7 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt, + for (i = 0; i < stream->outcnt; i++) + stream->out[i].state = SCTP_STREAM_OPEN; + ++in: + if (!incnt) + return 0; + +diff --git a/scripts/Makefile.build b/scripts/Makefile.build +index bb831d49bcfd..e63af4e19382 100644 +--- a/scripts/Makefile.build ++++ b/scripts/Makefile.build +@@ -259,7 +259,7 @@ ifneq ($(SKIP_STACK_VALIDATION),1) + + __objtool_obj := $(objtree)/tools/objtool/objtool + +-objtool_args = $(if $(CONFIG_ORC_UNWINDER),orc generate,check) ++objtool_args = $(if $(CONFIG_UNWINDER_ORC),orc generate,check) + + ifndef CONFIG_FRAME_POINTER + objtool_args += --no-fp +diff --git a/scripts/headers_install.sh b/scripts/headers_install.sh +index 4d1ea96e8794..a18bca720995 100755 +--- a/scripts/headers_install.sh ++++ b/scripts/headers_install.sh +@@ -34,7 +34,7 @@ do + sed -r \ + -e 's/([ \t(])(__user|__force|__iomem)[ \t]/\1/g' \ + -e 's/__attribute_const__([ \t]|$)/\1/g' \ +- -e 's@^#include @@' \ ++ -e 's@^#include @@' \ + -e 's/(^|[^a-zA-Z0-9])__packed([^a-zA-Z0-9_]|$)/\1__attribute__((packed))\2/g' \ + -e 's/(^|[ \t(])(inline|asm|volatile)([ \t(]|$)/\1__\2__\3/g' \ + -e 's@#(ifndef|define|endif[ \t]*/[*])[ \t]*_UAPI@#\1 @' \ +diff --git a/sound/soc/codecs/msm8916-wcd-analog.c b/sound/soc/codecs/msm8916-wcd-analog.c +index 549c269acc7d..18933bf6473f 100644 +--- a/sound/soc/codecs/msm8916-wcd-analog.c ++++ b/sound/soc/codecs/msm8916-wcd-analog.c +@@ -104,7 +104,7 @@ + #define CDC_A_MICB_1_VAL (0xf141) + #define MICB_MIN_VAL 1600 + #define MICB_STEP_SIZE 50 +-#define MICB_VOLTAGE_REGVAL(v) ((v - MICB_MIN_VAL)/MICB_STEP_SIZE) ++#define MICB_VOLTAGE_REGVAL(v) (((v - MICB_MIN_VAL)/MICB_STEP_SIZE) << 3) + #define MICB_1_VAL_MICB_OUT_VAL_MASK GENMASK(7, 3) + #define MICB_1_VAL_MICB_OUT_VAL_V2P70V ((0x16) << 3) + #define MICB_1_VAL_MICB_OUT_VAL_V1P80V ((0x4) << 3) +@@ -349,8 +349,9 @@ static void pm8916_wcd_analog_micbias_enable(struct snd_soc_codec *codec) + | MICB_1_CTL_EXT_PRECHARG_EN_ENABLE); + + if (wcd->micbias_mv) { +- snd_soc_write(codec, CDC_A_MICB_1_VAL, +- MICB_VOLTAGE_REGVAL(wcd->micbias_mv)); ++ snd_soc_update_bits(codec, CDC_A_MICB_1_VAL, ++ MICB_1_VAL_MICB_OUT_VAL_MASK, ++ MICB_VOLTAGE_REGVAL(wcd->micbias_mv)); + /* + * Special headset needs MICBIAS as 2.7V so wait for + * 50 msec for the MICBIAS to reach 2.7 volts. 
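The msm8916-wcd-analog fix above shifts MICB_VOLTAGE_REGVAL() into bits 7:3 so the computed step lands inside MICB_1_VAL_MICB_OUT_VAL_MASK, and switches to snd_soc_update_bits() so only that field is touched. The arithmetic can be checked in a standalone program (illustrative only; the mask value 0xf8 is GENMASK(7, 3) written out):

#include <stdio.h>

#define MICB_MIN_VAL            1600
#define MICB_STEP_SIZE          50
#define MICB_VOLTAGE_REGVAL(v)  ((((v) - MICB_MIN_VAL) / MICB_STEP_SIZE) << 3)
#define MICB_OUT_VAL_MASK       0xf8    /* GENMASK(7, 3) */

int main(void)
{
        int mv = 2700;  /* 2.7 V microphone bias */
        unsigned int regval = MICB_VOLTAGE_REGVAL(mv) & MICB_OUT_VAL_MASK;

        /* (2700 - 1600) / 50 = 0x16; shifted into bits 7:3 this is 0xb0 */
        printf("regval = 0x%02x\n", regval);
        return 0;
}

For 2.7 V this yields 0xb0, the same value as the driver's existing MICB_1_VAL_MICB_OUT_VAL_V2P70V definition, (0x16) << 3.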
+@@ -1241,6 +1242,8 @@ static const struct of_device_id pm8916_wcd_analog_spmi_match_table[] = { + { } + }; + ++MODULE_DEVICE_TABLE(of, pm8916_wcd_analog_spmi_match_table); ++ + static struct platform_driver pm8916_wcd_analog_spmi_driver = { + .driver = { + .name = "qcom,pm8916-wcd-spmi-codec", +diff --git a/sound/soc/img/img-parallel-out.c b/sound/soc/img/img-parallel-out.c +index 23b0f0f6ec9c..2fc8a6372206 100644 +--- a/sound/soc/img/img-parallel-out.c ++++ b/sound/soc/img/img-parallel-out.c +@@ -164,9 +164,11 @@ static int img_prl_out_set_fmt(struct snd_soc_dai *dai, unsigned int fmt) + return -EINVAL; + } + ++ pm_runtime_get_sync(prl->dev); + reg = img_prl_out_readl(prl, IMG_PRL_OUT_CTL); + reg = (reg & ~IMG_PRL_OUT_CTL_EDGE_MASK) | control_set; + img_prl_out_writel(prl, reg, IMG_PRL_OUT_CTL); ++ pm_runtime_put(prl->dev); + + return 0; + } +diff --git a/tools/objtool/check.c b/tools/objtool/check.c +index c0e26ad1fa7e..9b341584eb1b 100644 +--- a/tools/objtool/check.c ++++ b/tools/objtool/check.c +@@ -1757,11 +1757,14 @@ static int validate_branch(struct objtool_file *file, struct instruction *first, + if (insn->dead_end) + return 0; + +- insn = next_insn; +- if (!insn) { ++ if (!next_insn) { ++ if (state.cfa.base == CFI_UNDEFINED) ++ return 0; + WARN("%s: unexpected end of section", sec->name); + return 1; + } ++ ++ insn = next_insn; + } + + return 0; +diff --git a/tools/objtool/objtool.c b/tools/objtool/objtool.c +index 31e0f9143840..07f329919828 100644 +--- a/tools/objtool/objtool.c ++++ b/tools/objtool/objtool.c +@@ -70,7 +70,7 @@ static void cmd_usage(void) + + printf("\n"); + +- exit(1); ++ exit(129); + } + + static void handle_options(int *argc, const char ***argv) +@@ -86,9 +86,7 @@ static void handle_options(int *argc, const char ***argv) + break; + } else { + fprintf(stderr, "Unknown option: %s\n", cmd); +- fprintf(stderr, "\n Usage: %s\n", +- objtool_usage_string); +- exit(1); ++ cmd_usage(); + } + + (*argv)++; +diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c +index 64ae21f64489..7a2d221c4702 100644 +--- a/tools/testing/selftests/bpf/test_verifier.c ++++ b/tools/testing/selftests/bpf/test_verifier.c +@@ -606,7 +606,6 @@ static struct bpf_test tests[] = { + }, + .errstr = "misaligned stack access", + .result = REJECT, +- .flags = F_LOAD_WITH_STRICT_ALIGNMENT, + }, + { + "invalid map_fd for function call", +@@ -1797,7 +1796,6 @@ static struct bpf_test tests[] = { + }, + .result = REJECT, + .errstr = "misaligned stack access off (0x0; 0x0)+-8+2 size 8", +- .flags = F_LOAD_WITH_STRICT_ALIGNMENT, + }, + { + "PTR_TO_STACK store/load - bad alignment on reg", +@@ -1810,7 +1808,6 @@ static struct bpf_test tests[] = { + }, + .result = REJECT, + .errstr = "misaligned stack access off (0x0; 0x0)+-10+8 size 8", +- .flags = F_LOAD_WITH_STRICT_ALIGNMENT, + }, + { + "PTR_TO_STACK store/load - out of bounds low", +@@ -6115,7 +6112,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6139,7 +6136,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6165,7 +6162,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R8 invalid mem access 'inv'", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { 
+@@ -6190,7 +6187,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R8 invalid mem access 'inv'", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6238,7 +6235,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6309,7 +6306,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6360,7 +6357,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6387,7 +6384,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6413,7 +6410,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6442,7 +6439,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6472,7 +6469,7 @@ static struct bpf_test tests[] = { + BPF_JMP_IMM(BPF_JA, 0, 0, -7), + }, + .fixup_map1 = { 4 }, +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + }, + { +@@ -6500,8 +6497,7 @@ static struct bpf_test tests[] = { + BPF_EXIT_INSN(), + }, + .fixup_map1 = { 3 }, +- .errstr_unpriv = "R0 pointer comparison prohibited", +- .errstr = "R0 min value is negative", ++ .errstr = "unbounded min value", + .result = REJECT, + .result_unpriv = REJECT, + }, +@@ -6556,6 +6552,462 @@ static struct bpf_test tests[] = { + .errstr = "R0 min value is negative, either use unsigned index or do a if (index >=0) check.", + .result = REJECT, + }, ++ { ++ "bounds check based on zero-extended MOV", ++ .insns = { ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4), ++ /* r2 = 0x0000'0000'ffff'ffff */ ++ BPF_MOV32_IMM(BPF_REG_2, 0xffffffff), ++ /* r2 = 0 */ ++ BPF_ALU64_IMM(BPF_RSH, BPF_REG_2, 32), ++ /* no-op */ ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_2), ++ /* access at offset 0 */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0), ++ /* exit */ ++ BPF_MOV64_IMM(BPF_REG_0, 0), ++ BPF_EXIT_INSN(), ++ }, ++ .fixup_map1 = { 3 }, ++ .result = ACCEPT ++ }, ++ { ++ "bounds check based on sign-extended MOV. 
test1", ++ .insns = { ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4), ++ /* r2 = 0xffff'ffff'ffff'ffff */ ++ BPF_MOV64_IMM(BPF_REG_2, 0xffffffff), ++ /* r2 = 0xffff'ffff */ ++ BPF_ALU64_IMM(BPF_RSH, BPF_REG_2, 32), ++ /* r0 = */ ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_2), ++ /* access to OOB pointer */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0), ++ /* exit */ ++ BPF_MOV64_IMM(BPF_REG_0, 0), ++ BPF_EXIT_INSN(), ++ }, ++ .fixup_map1 = { 3 }, ++ .errstr = "map_value pointer and 4294967295", ++ .result = REJECT ++ }, ++ { ++ "bounds check based on sign-extended MOV. test2", ++ .insns = { ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4), ++ /* r2 = 0xffff'ffff'ffff'ffff */ ++ BPF_MOV64_IMM(BPF_REG_2, 0xffffffff), ++ /* r2 = 0xfff'ffff */ ++ BPF_ALU64_IMM(BPF_RSH, BPF_REG_2, 36), ++ /* r0 = */ ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_2), ++ /* access to OOB pointer */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0), ++ /* exit */ ++ BPF_MOV64_IMM(BPF_REG_0, 0), ++ BPF_EXIT_INSN(), ++ }, ++ .fixup_map1 = { 3 }, ++ .errstr = "R0 min value is outside of the array range", ++ .result = REJECT ++ }, ++ { ++ "bounds check based on reg_off + var_off + insn_off. test1", ++ .insns = { ++ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, ++ offsetof(struct __sk_buff, mark)), ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4), ++ BPF_ALU64_IMM(BPF_AND, BPF_REG_6, 1), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, (1 << 29) - 1), ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_6), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, (1 << 29) - 1), ++ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 3), ++ BPF_MOV64_IMM(BPF_REG_0, 0), ++ BPF_EXIT_INSN(), ++ }, ++ .fixup_map1 = { 4 }, ++ .errstr = "value_size=8 off=1073741825", ++ .result = REJECT, ++ .prog_type = BPF_PROG_TYPE_SCHED_CLS, ++ }, ++ { ++ "bounds check based on reg_off + var_off + insn_off. 
test2", ++ .insns = { ++ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1, ++ offsetof(struct __sk_buff, mark)), ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4), ++ BPF_ALU64_IMM(BPF_AND, BPF_REG_6, 1), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, (1 << 30) - 1), ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_6), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, (1 << 29) - 1), ++ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 3), ++ BPF_MOV64_IMM(BPF_REG_0, 0), ++ BPF_EXIT_INSN(), ++ }, ++ .fixup_map1 = { 4 }, ++ .errstr = "value 1073741823", ++ .result = REJECT, ++ .prog_type = BPF_PROG_TYPE_SCHED_CLS, ++ }, ++ { ++ "bounds check after truncation of non-boundary-crossing range", ++ .insns = { ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), ++ /* r1 = [0x00, 0xff] */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0), ++ BPF_MOV64_IMM(BPF_REG_2, 1), ++ /* r2 = 0x10'0000'0000 */ ++ BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 36), ++ /* r1 = [0x10'0000'0000, 0x10'0000'00ff] */ ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_2), ++ /* r1 = [0x10'7fff'ffff, 0x10'8000'00fe] */ ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0x7fffffff), ++ /* r1 = [0x00, 0xff] */ ++ BPF_ALU32_IMM(BPF_SUB, BPF_REG_1, 0x7fffffff), ++ /* r1 = 0 */ ++ BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 8), ++ /* no-op */ ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1), ++ /* access at offset 0 */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0), ++ /* exit */ ++ BPF_MOV64_IMM(BPF_REG_0, 0), ++ BPF_EXIT_INSN(), ++ }, ++ .fixup_map1 = { 3 }, ++ .result = ACCEPT ++ }, ++ { ++ "bounds check after truncation of boundary-crossing range (1)", ++ .insns = { ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), ++ /* r1 = [0x00, 0xff] */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0xffffff80 >> 1), ++ /* r1 = [0xffff'ff80, 0x1'0000'007f] */ ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0xffffff80 >> 1), ++ /* r1 = [0xffff'ff80, 0xffff'ffff] or ++ * [0x0000'0000, 0x0000'007f] ++ */ ++ BPF_ALU32_IMM(BPF_ADD, BPF_REG_1, 0), ++ BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 0xffffff80 >> 1), ++ /* r1 = [0x00, 0xff] or ++ * [0xffff'ffff'0000'0080, 0xffff'ffff'ffff'ffff] ++ */ ++ BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 0xffffff80 >> 1), ++ /* r1 = 0 or ++ * [0x00ff'ffff'ff00'0000, 0x00ff'ffff'ffff'ffff] ++ */ ++ BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 8), ++ /* no-op or OOB pointer computation */ ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1), ++ /* potentially OOB access */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0), ++ /* exit */ ++ BPF_MOV64_IMM(BPF_REG_0, 0), ++ BPF_EXIT_INSN(), ++ }, ++ .fixup_map1 = { 3 }, ++ /* not actually fully unbounded, but the bound is very high */ ++ .errstr = "R0 unbounded memory access", ++ .result = REJECT ++ }, ++ { ++ "bounds check after truncation of boundary-crossing range (2)", ++ .insns = { ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, 
BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), ++ /* r1 = [0x00, 0xff] */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0xffffff80 >> 1), ++ /* r1 = [0xffff'ff80, 0x1'0000'007f] */ ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0xffffff80 >> 1), ++ /* r1 = [0xffff'ff80, 0xffff'ffff] or ++ * [0x0000'0000, 0x0000'007f] ++ * difference to previous test: truncation via MOV32 ++ * instead of ALU32. ++ */ ++ BPF_MOV32_REG(BPF_REG_1, BPF_REG_1), ++ BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 0xffffff80 >> 1), ++ /* r1 = [0x00, 0xff] or ++ * [0xffff'ffff'0000'0080, 0xffff'ffff'ffff'ffff] ++ */ ++ BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 0xffffff80 >> 1), ++ /* r1 = 0 or ++ * [0x00ff'ffff'ff00'0000, 0x00ff'ffff'ffff'ffff] ++ */ ++ BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 8), ++ /* no-op or OOB pointer computation */ ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1), ++ /* potentially OOB access */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0), ++ /* exit */ ++ BPF_MOV64_IMM(BPF_REG_0, 0), ++ BPF_EXIT_INSN(), ++ }, ++ .fixup_map1 = { 3 }, ++ /* not actually fully unbounded, but the bound is very high */ ++ .errstr = "R0 unbounded memory access", ++ .result = REJECT ++ }, ++ { ++ "bounds check after wrapping 32-bit addition", ++ .insns = { ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 5), ++ /* r1 = 0x7fff'ffff */ ++ BPF_MOV64_IMM(BPF_REG_1, 0x7fffffff), ++ /* r1 = 0xffff'fffe */ ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0x7fffffff), ++ /* r1 = 0 */ ++ BPF_ALU32_IMM(BPF_ADD, BPF_REG_1, 2), ++ /* no-op */ ++ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1), ++ /* access at offset 0 */ ++ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0), ++ /* exit */ ++ BPF_MOV64_IMM(BPF_REG_0, 0), ++ BPF_EXIT_INSN(), ++ }, ++ .fixup_map1 = { 3 }, ++ .result = ACCEPT ++ }, ++ { ++ "bounds check after shift with oversized count operand", ++ .insns = { ++ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), ++ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), ++ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), ++ BPF_LD_MAP_FD(BPF_REG_1, 0), ++ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ++ BPF_FUNC_map_lookup_elem), ++ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6), ++ BPF_MOV64_IMM(BPF_REG_2, 32), ++ BPF_MOV64_IMM(BPF_REG_1, 1), ++ /* r1 = (u32)1 << (u32)32 = ? 
 */
++			BPF_ALU32_REG(BPF_LSH, BPF_REG_1, BPF_REG_2),
++			/* r1 = [0x0000, 0xffff] */
++			BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0xffff),
++			/* computes unknown pointer, potentially OOB */
++			BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
++			/* potentially OOB access */
++			BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0),
++			/* exit */
++			BPF_MOV64_IMM(BPF_REG_0, 0),
++			BPF_EXIT_INSN(),
++		},
++		.fixup_map1 = { 3 },
++		.errstr = "R0 max value is outside of the array range",
++		.result = REJECT
++	},
++	{
++		"bounds check after right shift of maybe-negative number",
++		.insns = {
++			BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
++			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
++			BPF_LD_MAP_FD(BPF_REG_1, 0),
++			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
++				     BPF_FUNC_map_lookup_elem),
++			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6),
++			/* r1 = [0x00, 0xff] */
++			BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
++			/* r1 = [-0x01, 0xfe] */
++			BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1),
++			/* r1 = 0 or 0xff'ffff'ffff'ffff */
++			BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 8),
++			/* r1 = 0 or 0xffff'ffff'ffff */
++			BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 8),
++			/* computes unknown pointer, potentially OOB */
++			BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
++			/* potentially OOB access */
++			BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, 0),
++			/* exit */
++			BPF_MOV64_IMM(BPF_REG_0, 0),
++			BPF_EXIT_INSN(),
++		},
++		.fixup_map1 = { 3 },
++		.errstr = "R0 unbounded memory access",
++		.result = REJECT
++	},
++	{
++		"bounds check map access with off+size signed 32bit overflow. test1",
++		.insns = {
++			BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
++			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
++			BPF_LD_MAP_FD(BPF_REG_1, 0),
++			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
++				     BPF_FUNC_map_lookup_elem),
++			BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
++			BPF_EXIT_INSN(),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 0x7ffffffe),
++			BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
++			BPF_JMP_A(0),
++			BPF_EXIT_INSN(),
++		},
++		.fixup_map1 = { 3 },
++		.errstr = "map_value pointer and 2147483646",
++		.result = REJECT
++	},
++	{
++		"bounds check map access with off+size signed 32bit overflow. test2",
++		.insns = {
++			BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
++			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
++			BPF_LD_MAP_FD(BPF_REG_1, 0),
++			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
++				     BPF_FUNC_map_lookup_elem),
++			BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
++			BPF_EXIT_INSN(),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 0x1fffffff),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 0x1fffffff),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 0x1fffffff),
++			BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
++			BPF_JMP_A(0),
++			BPF_EXIT_INSN(),
++		},
++		.fixup_map1 = { 3 },
++		.errstr = "pointer offset 1073741822",
++		.result = REJECT
++	},
++	{
++		"bounds check map access with off+size signed 32bit overflow. test3",
++		.insns = {
++			BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
++			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
++			BPF_LD_MAP_FD(BPF_REG_1, 0),
++			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
++				     BPF_FUNC_map_lookup_elem),
++			BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
++			BPF_EXIT_INSN(),
++			BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 0x1fffffff),
++			BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 0x1fffffff),
++			BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 2),
++			BPF_JMP_A(0),
++			BPF_EXIT_INSN(),
++		},
++		.fixup_map1 = { 3 },
++		.errstr = "pointer offset -1073741822",
++		.result = REJECT
++	},
++	{
++		"bounds check map access with off+size signed 32bit overflow. test4",
++		.insns = {
++			BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
++			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
++			BPF_LD_MAP_FD(BPF_REG_1, 0),
++			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
++				     BPF_FUNC_map_lookup_elem),
++			BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
++			BPF_EXIT_INSN(),
++			BPF_MOV64_IMM(BPF_REG_1, 1000000),
++			BPF_ALU64_IMM(BPF_MUL, BPF_REG_1, 1000000),
++			BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
++			BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 2),
++			BPF_JMP_A(0),
++			BPF_EXIT_INSN(),
++		},
++		.fixup_map1 = { 3 },
++		.errstr = "map_value pointer and 1000000000000",
++		.result = REJECT
++	},
++	{
++		"pointer/scalar confusion in state equality check (way 1)",
++		.insns = {
++			BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
++			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
++			BPF_LD_MAP_FD(BPF_REG_1, 0),
++			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
++				     BPF_FUNC_map_lookup_elem),
++			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
++			BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
++			BPF_JMP_A(1),
++			BPF_MOV64_REG(BPF_REG_0, BPF_REG_10),
++			BPF_JMP_A(0),
++			BPF_EXIT_INSN(),
++		},
++		.fixup_map1 = { 3 },
++		.result = ACCEPT,
++		.result_unpriv = REJECT,
++		.errstr_unpriv = "R0 leaks addr as return value"
++	},
++	{
++		"pointer/scalar confusion in state equality check (way 2)",
++		.insns = {
++			BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
++			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
++			BPF_LD_MAP_FD(BPF_REG_1, 0),
++			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
++				     BPF_FUNC_map_lookup_elem),
++			BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
++			BPF_MOV64_REG(BPF_REG_0, BPF_REG_10),
++			BPF_JMP_A(1),
++			BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
++			BPF_EXIT_INSN(),
++		},
++		.fixup_map1 = { 3 },
++		.result = ACCEPT,
++		.result_unpriv = REJECT,
++		.errstr_unpriv = "R0 leaks addr as return value"
++	},
+ 	{
+ 		"variable-offset ctx access",
+ 		.insns = {
+@@ -6597,6 +7049,71 @@ static struct bpf_test tests[] = {
+ 		.result = REJECT,
+ 		.prog_type = BPF_PROG_TYPE_LWT_IN,
+ 	},
++	{
++		"indirect variable-offset stack access",
++		.insns = {
++			/* Fill the top 8 bytes of the stack */
++			BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
++			/* Get an unknown value */
++			BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 0),
++			/* Make it small and 4-byte aligned */
++			BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 4),
++			BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 8),
++			/* add it to fp. We now have either fp-4 or fp-8, but
++			 * we don't know which
++			 */
++			BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_10),
++			/* dereference it indirectly */
++			BPF_LD_MAP_FD(BPF_REG_1, 0),
++			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
++				     BPF_FUNC_map_lookup_elem),
++			BPF_MOV64_IMM(BPF_REG_0, 0),
++			BPF_EXIT_INSN(),
++		},
++		.fixup_map1 = { 5 },
++		.errstr = "variable stack read R2",
++		.result = REJECT,
++		.prog_type = BPF_PROG_TYPE_LWT_IN,
++	},
++	{
++		"direct stack access with 32-bit wraparound. test1",
++		.insns = {
++			BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0x7fffffff),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0x7fffffff),
++			BPF_MOV32_IMM(BPF_REG_0, 0),
++			BPF_STX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
++			BPF_EXIT_INSN()
++		},
++		.errstr = "fp pointer and 2147483647",
++		.result = REJECT
++	},
++	{
++		"direct stack access with 32-bit wraparound. test2",
++		.insns = {
++			BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0x3fffffff),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0x3fffffff),
++			BPF_MOV32_IMM(BPF_REG_0, 0),
++			BPF_STX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
++			BPF_EXIT_INSN()
++		},
++		.errstr = "fp pointer and 1073741823",
++		.result = REJECT
++	},
++	{
++		"direct stack access with 32-bit wraparound. test3",
++		.insns = {
++			BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0x1fffffff),
++			BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0x1fffffff),
++			BPF_MOV32_IMM(BPF_REG_0, 0),
++			BPF_STX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
++			BPF_EXIT_INSN()
++		},
++		.errstr = "fp pointer offset 1073741822",
++		.result = REJECT
++	},
+ 	{
+ 		"liveness pruning and write screening",
+ 		.insns = {
+diff --git a/tools/testing/selftests/x86/ldt_gdt.c b/tools/testing/selftests/x86/ldt_gdt.c
+index 2afc41a3730f..66e5ce5b91f0 100644
+--- a/tools/testing/selftests/x86/ldt_gdt.c
++++ b/tools/testing/selftests/x86/ldt_gdt.c
+@@ -137,30 +137,51 @@ static void check_valid_segment(uint16_t index, int ldt,
+ 	}
+ }
+ 
+-static bool install_valid_mode(const struct user_desc *desc, uint32_t ar,
+-			       bool oldmode)
++static bool install_valid_mode(const struct user_desc *d, uint32_t ar,
++			       bool oldmode, bool ldt)
+ {
+-	int ret = syscall(SYS_modify_ldt, oldmode ? 1 : 0x11,
+-			  desc, sizeof(*desc));
+-	if (ret < -1)
+-		errno = -ret;
++	struct user_desc desc = *d;
++	int ret;
++
++	if (!ldt) {
++#ifndef __i386__
++		/* No point testing set_thread_area in a 64-bit build */
++		return false;
++#endif
++		if (!gdt_entry_num)
++			return false;
++		desc.entry_number = gdt_entry_num;
++
++		ret = syscall(SYS_set_thread_area, &desc);
++	} else {
++		ret = syscall(SYS_modify_ldt, oldmode ? 1 : 0x11,
++			      &desc, sizeof(desc));
++
++		if (ret < -1)
++			errno = -ret;
++
++		if (ret != 0 && errno == ENOSYS) {
++			printf("[OK]\tmodify_ldt returned -ENOSYS\n");
++			return false;
++		}
++	}
++
+ 	if (ret == 0) {
+-		uint32_t limit = desc->limit;
+-		if (desc->limit_in_pages)
++		uint32_t limit = desc.limit;
++		if (desc.limit_in_pages)
+ 			limit = (limit << 12) + 4095;
+-		check_valid_segment(desc->entry_number, 1, ar, limit, true);
++		check_valid_segment(desc.entry_number, ldt, ar, limit, true);
+ 		return true;
+-	} else if (errno == ENOSYS) {
+-		printf("[OK]\tmodify_ldt returned -ENOSYS\n");
+-		return false;
+ 	} else {
+-		if (desc->seg_32bit) {
+-			printf("[FAIL]\tUnexpected modify_ldt failure %d\n",
++		if (desc.seg_32bit) {
++			printf("[FAIL]\tUnexpected %s failure %d\n",
++			       ldt ? "modify_ldt" : "set_thread_area",
+ 			       errno);
+ 			nerrs++;
+ 			return false;
+ 		} else {
+-			printf("[OK]\tmodify_ldt rejected 16 bit segment\n");
++			printf("[OK]\t%s rejected 16 bit segment\n",
++			       ldt ? "modify_ldt" : "set_thread_area");
+ 			return false;
+ 		}
+ 	}
+@@ -168,7 +189,15 @@ static bool install_valid_mode(const struct user_desc *desc, uint32_t ar,
+ 
+ static bool install_valid(const struct user_desc *desc, uint32_t ar)
+ {
+-	return install_valid_mode(desc, ar, false);
++	bool ret = install_valid_mode(desc, ar, false, true);
++
++	if (desc->contents <= 1 && desc->seg_32bit &&
++	    !desc->seg_not_present) {
++		/* Should work in the GDT, too. */
++		install_valid_mode(desc, ar, false, false);
++	}
++
++	return ret;
+ }
+ 
+ static void install_invalid(const struct user_desc *desc, bool oldmode)
+diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
+index 484e8820c382..2447d7c017e7 100644
+--- a/virt/kvm/kvm_main.c
++++ b/virt/kvm/kvm_main.c
+@@ -4018,7 +4018,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
+ 	if (!vcpu_align)
+ 		vcpu_align = __alignof__(struct kvm_vcpu);
+ 	kvm_vcpu_cache = kmem_cache_create("kvm_vcpu", vcpu_size, vcpu_align,
+-					   0, NULL);
++					   SLAB_ACCOUNT, NULL);
+ 	if (!kvm_vcpu_cache) {
+ 		r = -ENOMEM;
+ 		goto out_free_3;